Vol. 28, No. 2 April, 1944 


Journal of Applied Psychology 


EDITED BY: DONALD G. PATERSON, UNIVERSITY OF MINNESOTA 


Consulting Editors 


Paul S. Achilles, Psychological Corporation; Walter V. Bingham, A.G.O., War Department;
Harold E. Burtt, Ohio State University; Arthur I. Gates, T. C. Columbia University;
John G. Jenkins, University of Maryland; Irving Lorge, T. C. Columbia University;
Quinn McNemar, Stanford University; Willard C. Olson, University of Michigan;
James P. Porter, Ohio University; Edward K. Strong, Jr., Stanford University;
Morris S. Viteles, University of Pennsylvania; Joseph Zubin, N. Y. Psychiatric Institute.





Table of Contents 


Studies in Phonographic Recordings of Verbal Material: IV. Written Reports
of Interviews: B. J. Covner
An Objective Test of Vocational Interests: H. J. Older
Road Accidents: Pedestrians’ Beliefs Regarding Visibility at Night:
H. H. Ferguson
Legibility of Newspaper Headlines Printed in Capitals and in Lower Case:
K. Breland and M. K. Breland
A Comparison of Norms for the Minnesota Rate of Manipulation Test:
J. Tuckman
Comparison of an “Industrial” Problem Solving Task and an Assembly Task:
J. T. Rusmore
Note on Use of Pre-Test Practice Periods by Typist-Clerks: M. P. Manson
Attitudes of Elementary School Children to School, Teachers, and Classmates:
S. Tenenbaum
An Evaluation of Word and Picture Tests for First and Second Grades:
F. Poston
The Minnesota Multiphasic Personality Inventory: V. Hysteria, Hypomania
and Psychopathic Deviate: J. C. McKinley and S. R. Hathaway
A Note on the Clinical Use of the Hunt-Minnesota Test for Organic Brain
Damage: H. F. Hunt
A Reply to Dr. Donald E. Super: J. G. Darley
Book Reviews 
New Books, Monographs, and Pamphlets 





Published Bi-monthly by The American Psychological Association, Inc. 
With the Cooperation of The American Association for Applied Psychology 
Prince and Lemon Sts., Lancaster, Pa., and Northwestern University, Evanston, Illinois 


Entered as second-class matter, August 19, 1943, at the post office at Lancaster, Pa., under the Act of March 3, 1879 
Copyright, 1944, by The American Psychological Association, Inc. 




























Studies in Phonographic Recordings of Verbal Material: 
IV. Written Reports of Interviews * 


Bernard J. Covner 
University of Pennsylvania 


In a previous article by the writer an investigation was described in 
which the completeness and accuracy of counseling interview reports were 
determined by comparing the reports with phonographic recordings of the 
corresponding interviews.¹ Except for the fact that the counselors were
aware that their interviews were recorded, and for the fact that the reports 
were written immediately after the interviews, the study was conducted 
under usual, on-the-job conditions existing in a counseling agency. It 
was found that regardless of the extent of training and experience of the 
counselor, his written accounts of therapeutic contacts, at best, had serious 
limitations for use in training and research, as well as in the handling of 
the case itself. Although most of the material found in the reports was
accurate (75-95%), so much of the actual interview material (>70%), 
important and unimportant alike, was omitted from them that they gave 
a somewhat distorted picture of the contents of the original interview. 
In short, in most instances the interview report was a poor substitute for 
the typewritten transcription or “typescript” of the phonographic 
recording.

In the present paper some of the qualitative findings of this study will 
be illustrated, and their implications for counseling and psychotherapy 
discussed. In addition, ways in which the social worker, psychiatrist, and 
psychologist may improve written accounts of interview contacts with 
their clients will be indicated. 


* This article is a condensation of part of the writer’s doctoral thesis written at Ohio 
State University under Professor Carl R. Rogers. The research on which it was based 
was a unit of a coordinated program of research on the interviewing process subsidized 
by the Graduate School of Ohio State University. 

¹ Covner, B. J. Studies in phonographic recordings of verbal material: III. The
completeness and accuracy of counseling interview reports. Accepted for publication 
in the J. General Psychol. 







Illustrations 


As mentioned above, one of the major findings of the study in question 
was that most of the verbal action of the interview was not included in the 
counselor’s report. An illustration of this appears below in sections taken 
from a typescript and its corresponding report. This material at once 
demonstrates an application of the system used in quantifying the data of
the investigation,² as well as the nature of a counselor’s account of one of
his interviews.


On the right side of the page appears the typescript excerpt, with numbers
indicating the classification of the counselor and client items appearing
in separate columns in the right-hand margin. Thus the counselor item,
“You would very much like for me to accept you, wouldn’t you?” (an
interpretation on the part of the counselor) is classified “3” in the
right-hand margin. Similarly, the client item, “Why do you think I
think I’m hot stuff?” (a relevant question by the client) is classified
“1” in the right-hand margin.

On the left side of the page are found two sections from the report 
which correspond to two of the given typescript items, the only items in 
the excerpt which were included in the report. These inclusions, it will 
be noticed, accurately represent not only the material to which the arrows 
point, but other material in the interview as well. 

Such wholesale omission from the report of the remaining typescript
items deprives the individual who reads the report of much
highly significant and interesting material. Just as the hard-hitting 
tactics of the counselor and the resulting feelings of the client are withheld 
from this particular report, so was much equally important material with- 
held from most of the reports studied. As indicated by the excerpts 
below, how different are the pictures of the interview one obtains, depend- 
ing upon whether he reads the typescript or the report! 


Report                              Typescript                        C    S

                                    S.—Why do you think I think
                                       I’m hot stuff?                      1

                                    C.—John, the only thing I’m
                                       trying to do is get you to
                                       recognize how you think
                                       about yourself.                21

. . . on about four different   --> S.—Well, I just wanted to
occasions S. asked C. if he            know—do I give that
thought he were egotistical.           impression?                         1
(Accurate descriptive
condensation)

² A condensed description of this system may be found in the article cited in
footnote 1.













C.—avoided the question,        --> C.—You would very much like
turned it back to S. . . .             for me to accept you,
pointed out that he just               wouldn’t you?                  3
wanted C. to accept the
responsibility for his
feelings. (Accurate
descriptive condensation)

                                    S.—Oh I don’t—yeah. I just
                                       wanted to know, I would
                                       like to know how I impress
                                       people.                             3c

                                    C.—John, you are more
                                       concerned about that than
                                       anything.                      3

                                    S.—Yeah. /                             3a
                                       And the only way to find out
                                       is to ask them, and it makes
                                       you feel like a [illegible]
                                       to stop some[one] in the
                                       hall and say, “What do you
                                       think of me?” /                     6
                                       Oh I really don’t like to
                                       talk about myself, in fact,
                                       I’d rather stay off it.             6

                                    C.—Gets pretty uncomfortable
                                       at times, doesn’t it?          3

                                    S.—Sure does.                          3a


Note: C.—Counselor item; S.—Client (subject) item; 1. Relevant question;
3. Interpretation, clarification of feelings; 3a. Acceptance of interpretation
or clarification; 3c. Lack of clearness re acceptance or refusal of
interpretation or clarification; 6. Straightforward description; 21. Explanation of
purpose of counseling.


A rather poignant example of what goes on behind the closed doors of 
the counseling room and never gets into the counselor’s report is revealed 
in the following typescript excerpt. The counselor has advised the client 
to visit a University dean concerning entrance requirements to a particu- 
lar college. The action continues as follows: 


C.—Incidentally he has a very good-looking secretary, very good.
S.—He does? Is she young?
C.—She is young.
S.—Well, I do declare.
C.—So if you just went over there to look at the secretary, you wouldn’t
be wasting your time.
S.—“Beauty is where you find it, my boy.”
C.—Well, I would say, “Approach beauty.”
S.—(laugh) Say, you didn’t fall for her, did you?
C.—No, no, nothing like that. Not a married man like me.
S.—Are you married?
C.—Sure.
S.—Well, I’ll be darned! I’m glad I discovered it.



















Digressive material of this nature continues for several pages. The 
only mention of the “referral” in the counselor’s report reads as follows:
“Also urged him to see Dean—as to entrance requirements, etc.” Thus,
not only does this example reveal that the counselor’s terse account of the 
referral literally covered a multitude of sins, but it also indicates how little 
the counselor was disturbed by the presence of the microphone which, in 
the guise of a goose-neck lamp, rested on the interviewing table. 

A rather serious limitation of interview reports found in this investiga- 
tion and not dealt with quantitatively was concerned with the chronologi- 
cal sequence of events appearing in the report. In general, there was 
observed a tendency for the chronological sequence of report sections to 
parallel the time sequence of typescript items. Close inspection, how- 
ever, revealed many instances of alterations in time sequence. 

These tendencies to progress and regress in time sequence are illus- 
trated below in an excerpt from a written report. The numerals appear- 
ing at the beginning of each section indicate the order number of the 
typescript item to which the report section in question corresponds. The 
presence of more than one order number at the beginning of a section 
means that the section referred to more than one typescript section. 


(32, 36) H. replied by telling of several intelligence tests that he had taken, /
(32) that he knew that he did very well, / (32) that with exception of one or two
boys he was the brightest boy in the math class, / (50) but that he really was
worried about his real ability to do things. / (55, 71) C. suggested that he
bolstered his confidence in himself by reference to his intelligence rating / (71) but
he really felt inferior to other people. / (72) This H. rejected rather violently.
/ (63, 113, 115) C. said that what H. was really worried about was not being
accepted by the people with whom he [illegible] / (64, 114, 116) and though there
was some acceptance of this [illegible] there was evident a tendency to blame
other people for not accepting him. / (51) C. asked where the responsibility for
his not getting along should be placed. / (76) Then followed a long
discussion. . . .


Changes in chronological sequence of the sort illustrated above somewhat
limit the individual who uses the interview report to establish
techniques of counseling. If the order numbers were not present in the above
excerpt there would undoubtedly be a tendency for such an individual to 
assume direct causal connections between successive sections. The order 
numbers, of course, indicate that this is definitely not always the case, and 
that in writing up his report the counselor has a tendency to tie events up 
in an order different from that in which they originally occurred. 
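
The check that the order numbers make possible can be sketched in a few lines of Python. This is an illustration only, not a procedure used in the study: the name `find_regressions` and the rule of comparing each section's first order number with that of the section before it are assumptions introduced here. The order numbers are those printed in the excerpt above.

```python
# Order numbers of the typescript items cited at the start of each
# report section in the excerpt above, listed in report order.
REPORT_SECTIONS = [
    (32, 36), (32,), (32,), (50,), (55, 71), (71,), (72,),
    (63, 113, 115), (64, 114, 116), (51,), (76,),
]


def find_regressions(sections):
    """Return 0-based indices of sections that step backward in time.

    Assumed rule (not taken from the study): a section regresses when
    its first order number is lower than the first order number of the
    section immediately before it.
    """
    regressions = []
    for index in range(1, len(sections)):
        if sections[index][0] < sections[index - 1][0]:
            regressions.append(index)
    return regressions


if __name__ == "__main__":
    for index in find_regressions(REPORT_SECTIONS):
        current = REPORT_SECTIONS[index][0]
        previous = REPORT_SECTIONS[index - 1][0]
        print(f"section {index + 1} cites item {current}, "
              f"which precedes item {previous} cited just before it")
```

Under this rule the eighth and tenth sections of the excerpt are flagged: the report moves back from item 72 to item 63, and from item 64 to item 51. A reader without the order numbers would have no way of seeing either jump.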

Lack of precision in the manner in which many of the counselors wrote 
up their interviews considerably reduced the clarity of their reports. By 
failing to indicate who was responsible for a particular statement, or 
whether a statement was descriptive or inferential, the counselor’s efforts 
to give an outsider who might read his report a clear picture of what took 
















place in the interview were substantially hampered. The following ex- 
cerpts from reports illustrate this lack of precision in writing, and in 
reading them one is forced to ask whether they are descriptions or 
inferences: 


1. Has little occasion where outlining seems natural. 
2. Jack’s headaches are concerning him less and less. 


3. Mary’s grades for the mid-quarter show some improvement but still are 
unsatisfactory to her. 


4. He realizes that he needs work in grammar before he can do well in an 
English composition course but he does not want to take English 400 
because he will not receive college credit for it. 


How easy it would have been for the counselors to have prefaced their 
remarks in these instances by “He said,” “It seemed to me,” or “I said,” 
and how much more revealing their statements would then have been! 

Instances of ambiguities such as the following also detracted from the 
clarity of reports: 


1. He also tells me that he has found an agricultural fraternity which should 
help his social participation and adjustment. 


When one reads such a statement, he is inclined to ask: (1) Was that 
part of the statement concerning aid to social participation and adjust- 
ment uttered by the client, or was it inferred by the counselor? (2) Is the 
adjustment due to come about merely because an agricultural fraternity 
was found, or because some particular agricultural fraternity was found? 


2. Had, as had been suggested, made out cards for zoology listing character- 
istics of various species of life. Good. 


A good series of cards? A good idea that she did it? 

Let us conclude this section by presenting unedited and in their 
entirety one of the worst and one of the best interview reports examined 
in this investigation. The accuracy score³ of the first was 2.6 out of a
possible 100, and that of the second, 30.8. Just as the first does not 
incorporate all that is bad in interview report writing, neither does the 
second epitomize all that is good. They are merely presented to illustrate 
what may be encountered in the way of quality and quantity in counseling 
interview reports, and without further comment are left to speak for 
themselves. 


1. Went over again with Art the fundamentals of reading a text-book. Is 
apparently making some, but not considerable progress. 


³ Accuracy score = (total no. typescript items accurately included in
counselor’s report / total no. typescript items) × 100.



















Discussed time schedule. Art sees that he is putting in too few study 
hours and will begin to study in the afternoon. 


Art has started working on his spelling and grammar deficiencies. 
Art just failed a history exam. and will probably begin to study harder. 


2. Began by discussing the economics midterm for which he had studied so
hard. S. expressed the opinion that he had misjudged what the exam.
would be like—was looking for tricks in the exam. as he had found often
happened in his Psychology Course. However, the exam. was “straight
stuff.” Said he was little disappointed that he had only gotten a “C”
but hastened to explain that no “A’s” were given, only 6 “B’s” and his
“C” was only one point under a “B”; C. interpreted that S. felt like he
had done pretty well after all; S. accepted this and explained that he felt
like he actually knew as much as the fellows who made “B’s.”

S. felt he would have another chance when the essay part of the exam. is
to be given this week. Instructor announced that it would be taken from
questions at end of chapters and S. said he would do all right on any of
those questions with a little review. C. interpreted that S. felt confident
of being able to answer the essay questions because of his intensive review
of the chapter questions last week. S. agreed on this.

S. stated that he seemed to be making a “C-” in all his courses and hoped
to pull some of them up to “B’s” by end of quarter. C. pointed out that
S. was beginning to feel he had academic problems under control. S.
accepted (pause).

S. explained incidents of weekend during which his girl lost a valuable
sorority pin and efforts he had gone to in [trying] to find it. C.
interpreted S. felt responsible for loss but S. corrected by saying in effect he
felt concerned about it and responsible for helping her recover it.

After a pause S. stated that he was completely through with his job and
without stopping went on to say that he was going to do an extra library
assignment in 411 some afternoon between 4 and 6. Just thought he’d
waste some time.

C. interpreted that S. was having some trouble finding things to do during
the time he had usually worked. S. did not accept this but said he has
been very busy in the mornings at the police station, newspapers, and
fraternity houses, trying to get a line on the missing sorority pin.
However, next quarter he would have enough to do. He thought he’d go over
to Commerce Library and read up on corporations and maybe write a
paper on stocks and bonds, etc. First said that he’d like to get some
dope and then play the market but went on to say his Dad would be
interested in this. C. interpreted that his Dad would be pleased to know
he had done this and S. would like to please him. S. accepted this.
C. feels that S. is not [wholly] agreed on the advisability of having
given up his job and somewhat reminds C. that C. was responsible for it.
However, he does see there are things he can profitably do during this
time that was formerly spent in work.

He still has some regrets about giving up the job but he will take the
responsibility for it if he has to. Grudgingly admits that there are things
he can do during that time.


After twenty minutes and at end of a pause S. said he guessed that was 
all he had to talk about today. C. remarked that their agreement was 










to [stop] whenever S. felt there was no more to discuss and that if some day
he had nothing he wanted to talk about we would part at the beginning
of the period. It was up to him.

Then came a spontaneous, self-initiated expression of appreciation of the
benefits that had been derived from these conferences. S. stated that he
felt free to discuss his problems in “C’s” presence, that he could see things
he wouldn’t admit to himself. Felt that the whole thing had been very
helpful. This was so unexpected and made C. feel so good that he
expressed his feelings on the matter by saying “That makes me feel good
too.”
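
The accuracy score quoted for the two reports above is, by the definition in footnote 3, a simple percentage. The short Python sketch below makes the arithmetic explicit. The function name `accuracy_score` and the item counts in the usage lines are hypothetical, chosen only for illustration; the raw counts behind the scores of 2.6 and 30.8 are not given in this article.

```python
def accuracy_score(items_accurately_included, total_typescript_items):
    """Percentage of typescript items accurately included in a report.

    Follows the definition in footnote 3:
    (accurately included items / total typescript items) x 100.
    """
    if total_typescript_items <= 0:
        raise ValueError("total_typescript_items must be positive")
    if not 0 <= items_accurately_included <= total_typescript_items:
        raise ValueError("included items must lie between 0 and the total")
    return 100.0 * items_accurately_included / total_typescript_items


if __name__ == "__main__":
    # Hypothetical counts, for illustration only: a report that
    # accurately includes 13 of the 50 items in its typescript.
    print(accuracy_score(13, 50))
```

Because the denominator is the number of typescript items, not the number of report statements, a report can be accurate in nearly everything it says and still score low when it omits most of the interview.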


Some Implications and Suggestions 


The question now arises as to what bearing the above-illustrated find- 
ings have upon the evaluation of interview reports written under circum- 
stances different from those of the present study. How would one 
evaluate, for instance, the reports written under usual, on-the-job condi- 
tions in a child guidance clinic, social work agency, or psychiatric service 
center? There are two aspects of the latter situations which would tend 
to make the reports written in them slightly inferior, perhaps, to the best 
of those written for the present study. The first is that in such agencies 
the counselor frequently does not write his interview reports immediately 
after conducting an interview. Instead, activities such as further inter- 
viewing, testing, and so on, are interpolated in the period falling between 
the cessation of the interview and the writing of the report. The second 
aspect is that such reports are not written with the knowledge that the 
interviews have been recorded and that the reports are to be carefully 
examined for completeness and accuracy, as were those of the present 
study. 

It is by no means to be implied that interview reports as they are 
usually found in agencies which do interviewing, or in the literature on 
counseling and psychotherapy, are of no value. Much to the contrary.
From them considerable valuable information has been obtained and 
important research has originated. The fact, however, that such reports 
tend to reveal so little of what takes place in the actual interview, and fail 
to present events in their correct chronological sequence, indicates that 
we would be farther ahead in our present counseling techniques if more 
adequate accounts of counseling contacts had been available in the past. 

Does this mean that interview report writing should be discontinued, 
and phonographic recording substituted? The expense and annoying 
complications involved in the phonographic recording and transcribing of 
interviews⁴ unfortunately prevent such a substitution. What, then,





⁴ For details, see Covner, B. J. Studies in phonographic recordings of verbal
material: I. The use of phonographic recordings in counseling practice and research;
II. A device for transcribing phonographic recordings of verbal material. J. Consult.
Psychol., 1942, 6, 105-113 and 149-153.













should be the future policies concerning the recording of interview 
activities? 

If interview material is to be used for research or for instruction pur- 
poses it should, wherever possible, be phonographically recorded and 
transcribed into typescript form. Such material makes one’s arguments 
highly convincing and gives a realistic picture of the nature of interview- 
ing. For students who are learning to interview it is most helpful for 
their early interviews to be recorded and transcribed. By listening to 
their work as it is played back to them, and by studying their typescripts 
carefully, especially with the aid of a supervisor, they can correct many 
mistakes and faulty habits at an early date. Later, interviews should 
occasionally be recorded in order to determine the amount of progress 
made as well as to discover and to eliminate other errors. Such an appli- 
cation of the phonographic recording technique should make for faster, 
more efficient learning and ultimately more effective interviewing.⁵

In instances where it is not especially desirable or possible to make 
phonographic recordings of interviews, but it is necessary that a detailed 
written account of the interview be kept, report writing must be resorted 
to. Several findings of the present investigation point to ways in which 
such reports may be considerably improved. 

Most important of all, perhaps, for the writing of good reports, is that 
in the process of training the would-be interviewer obtain a clear under- 
standing of that type of interviewing which he plans to practice. He 
should know what his particular functions as an interviewer are, and 
should have a generalized framework onto which the actions of the inter- 
view can be attached, and by which the interview can be described. 
Actual experience in studying phonographic recordings and typescripts 
should foster the development of these concepts. 

Although he has but little evidence upon which to base his claim, it 
appears to the writer that next in importance is the taking of some sort of 
notes during the interview itself. The question of note-taking during 
interviews has generally been one on which there has been considerable 
discussion pro and con, but also one on which no research has been done. 
One of the commonest arguments against note-taking is that it will dis- 
turb the client. This is true in many instances, but it is also true that if 
tactfully handled there are many instances in which the client is not 
disturbed by note-taking. 

Another argument against note-taking is that the interviewer will 
become so absorbed in his notes that he will not be able to do as good a 


⁵ For further discussion, see Rogers, C. R. The use of electrically recorded
interviews in improving psychotherapeutic techniques. Amer. J. Orthopsychiat., 1942, 12,
429-435.
















job of interviewing as might otherwise be possible. This is probably true 
in some instances, but it also depends in some measure upon the experi- 
ence of the interviewer and upon the particular type of notes which he 
takes. 

If a person has no knowledge of shorthand, it is not necessary that he 
try to take down in longhand every word spoken in the interview. In- 
stead it seems desirable that he jot down wherever advisable and con- 
venient a few words or phrases to indicate the chronological sequence of 
interview events. Such a skeleton of topics covered, written in brief 
phrases, or in some personal code of the interviewer himself, will help him 
to reconstruct the interview at a subsequent period. Here it is extremely 
important that a record be made of both interviewer and interviewee 
responses. This is especially important in interviews in which a standard
series of topics is not covered, and in interviews of the standard topic 
type as well when the material gets off the regular track. 

The form and content of the report written from the interviewer’s 
notes will depend in large measure upon the use to which the material is 
put and upon the exigencies of the situation. Thus, in a counseling inter- 
view, it is well for the counselor to include in his report every word of 
the interview that he can possibly recall. Much material, apparently 
quite trivial when examined out of context, is highly significant in that it 
tells how the interview is going, whether the counselor is encountering 
resistance, and so on. It is almost needless to say that in writing up a 
counseling interview it is just as important for counselor responses as for 
client responses to be recorded. 

For other types of interviews, such as attitude survey interviews and 
employment interviews where standard questions are used, and where the 
chief function is one of information-gathering, it is less important for all 
of the conversation to be recorded. In such interviews much of the
material that is irrelevant to the situation may be omitted from the report
without harm. During such interviews, however, the interviewer will 
often depart from the standard question and ask one of slightly different 
wording, or one that doesn’t appear on the prepared form at all. In such 
instances, if the interview question form is to be evaluated and if the 
interviewer wants to be sure that he has a record of the response made by 
the interviewee, it is important that some indication of the change in 
procedure be made in his notes. 

Another practice that will make for increased accuracy of reporting is 
the use of the first person wherever possible. In a recording of material 
exactly as it is spoken, much of the reality of the original interview is 
preserved and there is less opportunity to confuse descriptive material 
with inferences. Further improvement can be obtained by underlining 
















words to indicate the emphasis originally given them. The use of 
parenthetical remarks to indicate the interviewer’s feelings at the time, 
or to indicate other important aspects of the situation will also do much 
toward making the interviewer’s account realistic. 

The use of the over-all item or over-all inference which gives informa- 
tion relating to the interview as a whole is also a practice which increases 
the scope and effectiveness of the written report considerably. Examples 
of this practice, employed by all too few of our counselors, appear below. 


Notice how much of what happened in the interview is revealed in a few 
words and phrases: 


1. She talked freely and easily. 

2. He seems interested and appreciative of any help. 

3. She seemed to know her main difficulty without looking at the test results. 

4. During the interviews he often talks around the point without coming to
it. It is also noticeable that he talks about extraneous and unimportant
material until the end of the interview when he seems to be getting down
to the real problem.

5. He seemed unwilling to take a frank look at his situation. He verbalizes
a desire to find the answers to his problem and occasionally leads up to
them, but when it gets right at the point where insight might occur, he
seems to shy away.


6. It seemed that all through the interview she was trying to get over the 
fact to the C. that she was pretty fed up with what we were doing at the 
clinic, and that she thought we could get at things much better by follow- 
ing some of her suggestions or some of Mrs. F.’s suggestions. 


7. The interview seemed to be slipping away without much to go on for 
next time. 


8. It appeared, too, that she had come to the clinic for the express purpose 
of hurriedly presenting this idea and that she was convinced that the 
problem was not one which concerned the home or herself. 


Where material not pertaining to the interview in question, such as a 
reference to an earlier interview or to an item in a case history, is included 
in the report, some indication should be given of its origin. This will 
prevent the reader as well as the report writer himself from later confusing 
it with the material of the interview in question. 

Last but not least, it is important to consider the question of grammar.
There is no gainsaying the fact that wherever the report writer is not 
attempting to reproduce the exact wording of the interview, which is 
often ungrammatical, the adherence to rules of grammar and punctuation 
will reduce the chances of making material ambiguous and enhance the 
general clarity and usefulness of the report. 


Received February 24, 1943. 
















An Objective Test of Vocational Interests * 
Lieut. (j.g.) Harry J. Older, H-V(S) USNR 


Many devices have been used in the attempted measurement of inter- 
ests. The interest inventory is the most important of these both from 
the standpoint of the number of counselors using them and the number of 
investigators working with them. The inventory approach consists of the 
comparison of the likes and dislikes of individuals through questionnaire 
items. Since the individual is asked to estimate his feeling, the method 
may be said to be subjective. A complete discussion of interest inven- 
tories is given by Fryer (4). 

Perhaps owing to this emphasis upon subjective inventories, attempts 
to measure interests objectively have been somewhat rare. Efforts in 
this direction usually have been complicated by the difficulty of develop- 
ing a technique which would measure interest without at the same time 
measuring intelligence, previous information, or abilities of one kind or 
another. Super and Roper (10), Roper (6), Bernstein (1) and Stead (7), 
report the results of a series of investigations which indicate that a method 
which avoids this difficulty is available. 

Briefly, the method was as follows: The subjects were shown a series 
of pictures which were projected onto a screen. These pictures repre- 
sented occupational activities. After seeing the pictures the subjects 
were asked to take a test covering the material they had seen. The 
assumption was made that those items which stimulated the most interest 
would be remembered best. 


The following general findings were reported: 


1. Scores on the tests showed low correlations with intelligence test scores 
and revealed little dependence on past experience. This was interpreted as 
indicating that, since the tests were not measuring intelligence or past experi- 
ence, they must be measuring interest in the activities represented. 


* This study was conducted in the Division of Psychology of Clark University. 
The author takes this opportunity to thank Dr. Donald E. Super who suggested the 
problem and directed the research. A more complete presentation of results, with all 
statistical tables, may be found in the original study, “An objective test of vocational 
interests.” Unpublished Doctor’s Dissertation, Clark University Library, Worcester, 
Mass. 

The opinions or assertions in this paper are the private ones of the author and are 
not to be construed as official or reflecting the views of the Navy Department or the 
Naval Service at large. 




2. Scores on the tests showed low correlations with scores on the Strong 
Vocational Interest Blank. In some instances, the scores were found to be 
more useful in distinguishing among various groups than were the scores on the 
Strong Vocational Interest Blank. 


The significance of these findings is both theoretical and practical. 
On the theoretical side, as many authors have pointed out, the inventory 
approach is an indirect one and results in an estimate of the closeness with 
which the likes and dislikes of one individual agree with those of others. 
A more valid estimate of the individual’s interest would be obtained if 
measurements were more straightforward and direct. The ideal method 
would be the measurement of the manifest interests of the individual. 
These studies hold promise of furnishing the basis for such an approach. 
The practical significance lies in finding an objective measure of interest 
which does not exceed the subjective instruments in testing time or in cost. 

The present investigation is a continuation of the research begun by 
the above authors. In addition, an attempt is made to represent all pro- 
fessions in one test. The aim is not primarily the development of an 
instrument, but the more complete investigation of a technique. It is 
hoped that as a result of this investigation something approximating a 
practical instrument may be made more nearly a reality. 


Method 


The general method followed in the present study was similar to that 
reported by Super and Roper (10), Roper (6), Bernstein (1) and Stead (7). 
Several important changes were, however, introduced. 

A film strip was constructed in which were represented activities in 
18 distinct occupations. These occupations were: (1) The Scientific Pro- 
fessions: a. Medicine, b. Dentistry, c. Chemistry, d. Engineering; (2) 
The Social Service Professions: a. Teaching, b. Social Work, c. The Min- 
istry; (3) The Literary and Legal Professions: a. Advertising, b. Jour- 
nalism, c. Non-Journalistic Writing, d. Law; (4) Clerical Activities; (5) 
Selling Activities: a. Wholesale, b. Retail; (6) The Trades: a. Metal 
Trades, b. Mining, c. Building Trades, d. Agriculture, e. Forestry. 

Carter and Jones (2), Dwyer (3), Strong (9), and Thurstone (11), using
factorial analysis techniques, found that occupational interests grouped
themselves into patterns similar to the above. Therefore, the 18
occupations were grouped into these six occupational families for the
sake of convenience.

Care was taken to obtain pictures which would represent activities of
each occupation adequately so that undue bias would not be given to any
particular specialty. The artistic quality of the pictures was held
constant insofar as possible. The total number of pictures was 120. It was
felt that this number would give adequate representation to each
occupational family.

The tests used in the studies by Roper, Bernstein and Stead were made 
up of multiple choice, modified type true-false, and matching questions. 
It was thus possible to examine the tests which had been administered by 
them and to determine the type of questions which yielded the best results. 
The type of item found to be best was one that asked the subject to mark 
those items he thought were included in the film-strip, leaving blank those 
which he thought did not appear. It was decided to use the same type of 
item in the present study. 

In determining the number of items, an attempt was made to make the 
test long enough to be reliable without being impractical in administra- 
tion. Super and Roper (10) found that a test using approximately 200 
items was as long as practical administration problems would permit, and 
reported a reliability coefficient of .83 with this test. With these consider- 
ations, the number of questions decided upon was 240. Thus, there 
were 40 questions within each group of occupations. One hundred sixty- 
eight items were constructed to test memory of whole pictures. Seventy- 
two items dealt with parts of the pictures or with specific items which were 
included in the pictures. Half of the items, both those dealing with parts 
and with wholes, did not refer to pictures in the film-strip and were thus 
to be left blank. 

For example, after seeing the section on the medical profession, the
subjects were asked to indicate whether or not the following scenes or
objects (as well as many others) had been portrayed:


1. Internes and nurse examining child; 2. Medical students in
physiology laboratory; 3. Doctor and assistant placing emergency case
in ambulance; 4. Doctor making examination before students;
5. Research physicians experimenting with animals; 6. Surgical
Oxygen Mask; 7. Stethoscope; 8. First Aid Equipment.


Half of the items were those which had been shown in the film and half 
were not represented and were to be left blank in marking the test. 

Scores on the total test were calculated by the right minus wrong 
procedure. Scores were also calculated separately for each of the six 
occupational groups. The percentage of the total score contributed by 
each of the parts was then calculated. The occupational field in which 
a subject made his largest percentage of total score was considered to be 
the field in which he was most interested, regardless of the absolute size 
of the raw score. By this method, it was possible to rule out the factors 
of intelligence and general interest for purposes of comparison. 
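The scoring steps just described can be sketched in a few lines. This is only an illustration under stated assumptions: the source does not say exactly how "rights" and "wrongs" were counted, so the sketch assumes a right is a marked item that appeared in the film-strip and a wrong is a marked item that did not, with blanks ignored. The function names and the example scores are hypothetical.

```python
from typing import Dict, List


def right_minus_wrong(marked: List[bool], shown: List[bool]) -> int:
    """Rights minus wrongs for one section of the test.

    Assumption (not stated in the source): a "right" is a marked item that
    did appear in the film-strip, a "wrong" is a marked item that did not,
    and items left blank are not counted either way.
    """
    if len(marked) != len(shown):
        raise ValueError("marked and shown must have the same length")
    right = sum(1 for m, s in zip(marked, shown) if m and s)
    wrong = sum(1 for m, s in zip(marked, shown) if m and not s)
    return right - wrong


def percentage_profile(part_scores: Dict[str, int]) -> Dict[str, float]:
    """Express each part score as a percentage of the total score."""
    total = sum(part_scores.values())
    if total <= 0:
        raise ValueError("total score must be positive to form percentages")
    return {field: 100.0 * score / total for field, score in part_scores.items()}


def field_of_greatest_interest(part_scores: Dict[str, int]) -> str:
    """Return the field contributing the largest share of the total score."""
    profile = percentage_profile(part_scores)
    return max(profile, key=profile.get)


# Hypothetical part scores for one subject (illustrative values only).
example_scores = {
    "scientific": 12,
    "social_service": 8,
    "literary_legal": 10,
    "clerical": 20,
    "selling": 6,
    "trades": 4,
}
example_profile = percentage_profile(example_scores)
example_field = field_of_greatest_interest(example_scores)
```

Because each part is expressed as a share of the subject's own total, two subjects with very different raw totals can still show the same profile, which is the sense in which the method sets aside general level of performance.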
















Procedure 


Subjects. The subjects were 336 high school students, 248 college 
students, and 59 business college students. All subjects were male. The 
high school subjects were all juniors and seniors, and were enrolled in all 
types of curricula. The only criterion of inclusion in the group was 
availability. 

Of the college students, 196 were engineering students in the last two
years of training and 52 were agriculture students. The total number of
subjects was 643.

The Testing Situation. The college students and the business college 
students received only the Vocational Interest Test. The film was shown 
to the subjects after a brief introduction by the examiner. During this 
introduction, the subjects were informed of the nature of the test and of its 
practical importance. The only specific instructions which preceded the 
showing of the film were, “You are now going to see a series of film slides,
in which are represented activities in several occupations. During the 
showing of the pictures you are merely to observe and take in what you 
see. Following the showing of the film, you will be asked to take a brief 
test covering the material in the film. Please pay close attention so that 
you will be able to make a good showing.” 

The film-strip was shown to all subjects under the same conditions 
insofar as possible. Seating arrangements allowing good visibility of the 
screen, and good facilities for taking the test were always provided. Each 
frame or picture was projected onto a screen for seven and one-half sec- 
onds. The showing of the pictures required about seventeen minutes. 

The test was given with no time limit but with instructions to work as 
rapidly as possible. Instructions for the test were self-explanatory, but 
wherever questions concerning the mechanics of the test arose they were 
answered. The entire test could be given and scored in approximately 
one hour. 

The high school students were given some supplementary tests. One 
hundred eighty-four of the high school subjects took not only the Voca- 
tional Interest Test, but the Otis Quick-Scoring Mental Ability Test, 
Gamma, Form AM, and the Strong Vocational Interest Blank for Men, 
Revised Edition. One hundred fifty-two took only the Vocational Inter- 
est Test and the Otis. 


Results 


Correlations with Intelligence. The correlation between the total
scores made on the Vocational Interest Test and Otis I.Q.’s was found to
be only .27 (N = 336). Correlations between the various part scores on
the Vocational Interest Test and Otis I.Q.’s ranged from -.23 to .22
(N = 136). All of these coefficients are low, indicating that there is little
likelihood that there is any real relationship between the Vocational
Interest Test and intelligence.

Absence of relationship between the scores on the Vocational Interest 
Test and intelligence does not, of course, throw direct light upon the 
validity of the method. It does, however, eliminate the possibility that 
the test is simply a measure of intelligence. As was previously pointed 
out, many of the attempts to measure interest objectively, particularly 
those using information tests, have been hampered by the relatively high 
correlations with intelligence. It would appear that this difficulty is 
avoided by the present method. 

Relationship to Previous Information. If it had been found that
persons having a great deal of previous information concerning an occupation
made higher scores on the relevant sections of the Vocational Interest Test
than those having less information, this would have made this technique
subject to the same criticism as information tests of interests. Haddad
(5) has shown, however, that previous information in any given field has
little relationship to scores on that part of the Vocational Interest Test
under investigation.

This finding is a somewhat indirect validation of the method as a 
whole. Coupled with the fact that there is little correlation between 
scores on this test and intelligence, these results make it logical to assume 
that the test is measuring interest. 

Correlations with the Strong Vocational Interest Blank. On a priori 
grounds it would seem that any two tests which measure interest, even 
though they use different techniques, should correlate with one another 
to some extent. Especially would this seem to be true where both tests 
distinguish between various groups of individuals with approximately 
similar efficiency. The correlation coefficients obtained between scores 
on the various sub-tests of the Vocational Interest Test and scores on 
comparable sections of the Strong Vocational Interest Blank ranged from 
.03 to .27. Scores on the clerical section of the Vocational Interest Test 
were correlated with those made on the clerical workers scoring key of 
the Strong Blank, the literary and legal professions score with the com- 
parable score on the Strong, and so on throughout the list of occupations. 

The correlation, or rather the lack of it, is not so surprising when
considered in the light of previously reported research with related
techniques. Roper (6) found a correlation between a Test of Interest in
Nursing and the Nursing Key of the Strong Blank of .003. Stead (7), using
a Test of Interest in Clerical Work, found correlations with Strong scores
which averaged .39. Correlations between scores on the Test of Interest in
the Metal Trades and Strong Scores were found to average .19 by Bernstein
(1). The method used by these investigators was closely related to that
used here. Therefore, the correlations found here seem to be more in
line with expectations based on experience than would be expected on
a priori grounds. The fact that these correlations were to be expected
does not, however, provide an explanation.

Super and Roper (10) offer the following explanation for the lack of
correlation between scores obtained by the two methods: The Strong
Blank measures the extent to which the likes and dislikes of an individual
agree with those of other individuals, whereas tests such as the Vocational
Interest Test measure the manifest or dynamic interests of the individual.
They suggest that the difference between the two things being measured
is such that no correlation need be assumed.

They go on to point out the following analogy. Scores made on 
Strong’s key for YMCA secretaries will distinguish that group from the 
general population. Scores made on the Thorndike Intelligence Test will 
also distinguish them from the general population. Yet the correlation 
between the YMCA secretary scale of the Strong and the Thorndike 
Intelligence Test is .18. Clearly the two tests measure two different
things. It is quite probable that the same explanation may be applied 
to the present problem. 

The Vocational Interest Test is an instrument which measures the 
dynamic vocational interests of the individual. The Strong Blank is one 
which measures communality of general likes and dislikes. Evidently 
these are independent functions. 

It is possible that a consideration of the two sets of scores, that is, 
those on the Strong Blank and those on the Vocational Interest Test,
might yield a picture which would make possible a more adequate estimate 
of an individual’s interests than mere consideration of one set of scores 
alone. This is, of course, only a conjecture subject to investigation. 

Relationship between Obtained Part Score and Theoretical Expectations. 
It will be remembered that the test was scored for six parts. Thus each 
individual obtained a score for each of the occupational fields represented 
in the film and in the test. The part of the test on which the individual’s 
highest score was made was considered to be that section in which his 
major interest lay. 

It was possible to set up hypotheses in advance in respect to the most
probable section of the test on which any group would make its highest
mean score, then later to compare these hypotheses with obtained scores.
This type of comparison among various groups was made whenever possible.
For example, it was to be expected that the group of business
college students would make its highest mean score on the section of the
test dealing with clerical activities. It was found that the mean score
on this section exceeded the mean score on every other section of the test
by a margin which was in every case significant. In all, 54 comparisons
among group mean scores were made. Out of these, 50 of the differences
were in the expected direction, two were in the reverse of the expected
direction, and two comparisons revealed no differences. Of these 50
differences in the expected direction, 38 reached commonly accepted levels
of statistical significance (D/σd = 2.78 or above), or in the case of small
samples, ¢ = less than one. The two differences which reversed
expectation were clearly insignificant.
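The critical ratio used as the significance criterion above (a difference between means divided by the standard error of that difference) can be illustrated as follows. This is a minimal sketch assuming two independent groups; the source does not give the means, standard deviations, or the exact standard-error formula used, and comparisons between sections taken by the same group would also need a term for the correlation between the two sets of scores. All numbers below are hypothetical.

```python
import math


def critical_ratio(mean_a: float, sd_a: float, n_a: int,
                   mean_b: float, sd_b: float, n_b: int) -> float:
    """Difference between two means divided by its standard error.

    Assumes the two groups are independent, so the standard error of the
    difference is the square root of the sum of the squared standard errors.
    """
    se_a = sd_a / math.sqrt(n_a)
    se_b = sd_b / math.sqrt(n_b)
    sigma_d = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / sigma_d


# Hypothetical values: two groups of 59 with equal spread.
example_ratio = critical_ratio(14.0, 5.0, 59, 11.0, 5.0, 59)
meets_criterion = example_ratio >= 2.78  # criterion quoted in the text
```

With these illustrative figures the ratio comes to about 3.26, which would clear the 2.78 criterion quoted in the text.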

The evidence indicates, then, that the test is one on which a given 
group will almost invariably make its highest mean score on the section on 
which it might reasonably be expected to do so. 

On the whole, the results of this part of the investigation were clear 
and of a very promising nature. 

Ability to Distinguish among Groups of Individuals. It is in demon- 
strating the ability to distinguish among groups of individuals that the 
results are perhaps more crucial. If scores made on the various sections 
of the test make it possible to distinguish among criterion groups, then 
the method is one deserving consideration. If not, it holds little promise. 
In the previous section the problem was one of considering the section of 
the test upon which any single group would make its highest mean score. 
Here the problem is one of determining which of several groups will make 
the highest mean score on any given section of the test. 

This problem was investigated by comparing mean scores made on 
pertinent sections of the test by various groups. For example, the scores 
made on the clerical section of the test by a group of business college 
students, a group of would-be clerical workers from among the high school 
groups, a group of non-clerical workers (having no desire to enter the 
field), and a standard sample of high school students (each occupational 
ambition group equally represented) were compared. It was expected 
that the business college students would make a higher mean score on the 
clerical section of the test than any other group, and that the would-be 
clerical workers would rank next. 

In all, 21 comparisons of group means of the above type were made. 
Of these 21, nine differences were clearly significant, four of the remainder 
approached significance, and all but three were in the expected direction. 

From this evidence it would appear that the test has ability to dis- 
tinguish among the groups of subjects investigated. 

Relationship between Expressed Ambitions and Scores Made on the Test.
The relationship between occupational choice and highest score made on
the various parts of the test was calculated by means of the chi square
technique. A χ² of 19.47 with five degrees of freedom was obtained.
(N = 121.) This indicates that there is a considerable relationship
between expressed ambition and scores made on the test.
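To see how unusual a chi square of 19.47 with five degrees of freedom would be by chance, the upper-tail probability can be computed directly. The sketch below uses the standard closed form for odd degrees of freedom and only the Python standard library; the function name is hypothetical, and the source itself reports only the statistic, not a probability.

```python
import math


def chi2_upper_tail_odd_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate for odd df.

    Uses the closed form
        Q(x; df) = erfc(sqrt(x/2))
                   + sqrt(2/pi) * exp(-x/2) * sum_j x**(j - 1/2) / (1*3*...*(2j-1)),
    with j running from 1 to (df - 1) / 2.
    """
    if df < 1 or df % 2 == 0:
        raise ValueError("this sketch handles odd degrees of freedom only")
    if x <= 0:
        return 1.0
    tail = math.erfc(math.sqrt(x / 2.0))
    # First series term (j = 1): sqrt(2/pi) * exp(-x/2) * sqrt(x)
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    for j in range(1, (df - 1) // 2 + 1):
        tail += term
        term *= x / (2 * j + 1)  # step to the next term of the series
    return tail


# Statistic and degrees of freedom as reported in the text.
p_value = chi2_upper_tail_odd_df(19.47, 5)
```

The resulting probability falls between .001 and .01, consistent with the standard tabled critical values for five degrees of freedom (15.086 at the .01 level and 20.515 at the .001 level).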

Comparison of Results with those Obtained Using the Strong Blank. It 
is unfortunate that more comparisons between the Vocational Interest 
Test and the Strong Vocational Interest Blank were not possible. 

The results of the comparisons which were made between the Voca- 
tional Interest Test and the Strong Vocational Interest Blank must be 
regarded as somewhat tentative in nature, as they are all based on high 
school students and not upon occupational groups. They are valuable, 
however, in that they indicate the direction which further, more compre- 
hensive investigations may take. Another consideration in comparing 
the two tests is that the Vocational Interest Test is in an experimental 
form, whereas the Strong Blank is the result of many years’ careful 
investigation. 

Comparison of the ability of the two tests to distinguish among various 
groups showed that they are both good discriminators. But the compari- 
sons are all between groups classified on the basis of expressed occupational 
preference, and it is in this classification that the explanation for the 
occasional slight superiority of the Strong Blank over the Vocational 
Interest Test may lie. Steinmetz (8) has shown that superficial interest 
in any field can cause increase in scores on the Strong Blank. This is less 
likely to be the case with the Vocational Interest Test. Thus, it might be 
expected that the Strong Blank would yield scores which would dis- 
tinguish among various ambition groups with an efficiency which would 
exceed any objective instrument. The discriminating power of the Voca- 
tional Interest Test in this respect therefore seems to be quite good. 
The study of groups of employed men might yield results in which the 
Vocational Interest Test would show even higher distinguishing efficiency. 

Reliability. The odd-even reliability coefficients obtained on the 
various parts of the test ranged from .32 to .63. These are, of course, too 
low for individual differential diagnosis. They are, however, probably
high enough to warrant the previous interpretations of the results ob- 
tained with groups. In order to make the test useful for individual work 
a revision and extension based upon an item analysis would be necessary. 
There should result from this revision a test, the various parts of which 
would have reliabilities which compare favorably with those of other trait 
and ability tests. 
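The effect that an extension of the test might have on reliability can be gauged with the Spearman-Brown formula, which is a standard result and not something the source applies explicitly. The sketch below is illustrative only: the source does not say whether the reported odd-even coefficients had already been stepped up to full length, and the target of .90 is an arbitrary assumption.

```python
def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability when a test of reliability r is lengthened k times."""
    return k * r / (1.0 + (k - 1.0) * r)


def length_factor_needed(r: float, target: float) -> float:
    """Lengthening factor k required to move reliability from r to target."""
    if not (0.0 < r < 1.0 and 0.0 < target < 1.0):
        raise ValueError("r and target must lie strictly between 0 and 1")
    return target * (1.0 - r) / (r * (1.0 - target))


# Reported range of part reliabilities, treated here as given values.
low_r, high_r = 0.32, 0.63
doubled_low = spearman_brown(low_r, 2.0)
doubled_high = spearman_brown(high_r, 2.0)
factor_for_high = length_factor_needed(high_r, 0.90)  # 0.90 is an assumed target
```

Under these assumptions, doubling a part would raise .63 to roughly .77 and .32 to roughly .48, which suggests why lengthening alone may not suffice and why the author pairs extension with revision based on item analysis.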


Summary and Conclusions 


The primary concern of this study was to continue the investigation of
a new technique of interest measurement. The method was one which
had shown promise of supplementing the inventory approach by measuring
interest in the activities of an occupation instead of comparing the
likes and dislikes of the individual with those of persons successful in a
given occupation.

Twenty pictures of activities in each of six occupational fields were 
shown to the subjects. After seeing the film they were given a test cover- 
ing the material seen in the film. The test consisted of 240 items of a 
modified true-false type. There were 40 questions on each occupational 
field. 

The relative interest of the individual was determined by comparing
his score on each part of the test with his score on each of the other parts.
The part contributing the largest share of his total score was considered
to indicate the field in which he was most interested.

The subjects taking the test included 336 high school students, 248 
college students, and 59 business college students. Of the high school 
students taking the test, all 336 also took the Otis Quick Scoring Mental 
Ability Test, Gamma, Form AM and 184 also took the Strong Vocational 
Interest Blank for Men, Revised Edition. 

Major conclusions which may be drawn from the study are: 


1. As it is not a measure of intelligence or previous information, the 
most tenable hypothesis is that the test does measure interest. 

2. The inventory method measures the agreement of general likes and 
dislikes, while this technique seems to measure the dynamic vocational 
interests of the individual. 

3. Groups of subjects make their highest part scores on those sections 
of the test which agree with the theoretical prediction as to where they 
should make their highest scores, e.g. business college students make 
higher scores on the clerical section of the test than on any other section. 

4. The scores made on the various sections of this test distinguish 
relatively well among various groups of individuals. 

5. A close agreement was found between scores made on the parts of 
the test and expressed occupational preferences. 

6. The reliability of the sections of the test is not high enough to 
warrant confidence in the test for use in individual differential diagnosis. 
It is, however, adequate for use in investigations in which the problem 
concerns the differences among groups. 

7. Further problems which might be investigated are:

a. A revision and extension of the film and test based upon item
analysis; b. Administration of the test to a greater variety of subjects,
particularly employed men, to determine more adequately the effectiveness
of the test in distinguishing among vocational groups; c. Assuming
the reliability to be improved by the revision of the test, follow-up studies
of individual high school cases should be made to determine effectiveness
of prediction.


Received May 13, 1943.


Bibliography


1. Bernstein, B. A test of interest in the metal trades. Master’s Thesis, Clark
   University, 1941.
2. Carter, H. D., and Jones, M. C. Vocational attitude patterns in high school
   students. J. educ. Psychol., 1938, 29, 321-324.
3. Dwyer, P. S. An analysis of nineteen occupational scores of the Strong Vocational
   Interest Blank. J. appl. Psychol., 1938, 22, 8-16.
4. Fryer, Douglas. The measurement of interests. New York: Henry Holt and Co.,
   1931.
5. Haddad, W. C. The effect of previously acquired information on a memory test
   of interest. Unpublished Master’s Thesis, Clark University, 1942.
6. Roper, Sylvia. A test of interest in nursing. Master’s Thesis, Clark University,
   1941.
7. Stead, A. L. A test of interest in clerical work. Master’s Thesis, Clark University,
   1941.
8. Steinmetz, R. C. Measuring ability to fake occupational interest. J. appl.
   Psychol., 1932, 16, 123-130.
9. Strong, E. K., Jr. Classification of occupations by interest. Person. J., 1934,
   12, 301-313.
10. Super, D. E., and Roper, Sylvia. An objective technique for testing vocational
    interests. J. appl. Psychol., 1941, 25, 487-498.
11. Thurstone, L. L. A multiple factor study of vocational interests. Person. J., 1931,
    10, 198-205.








Road Accidents: Pedestrians’ Beliefs Regarding 
Visibility at Night * 


H. H. Ferguson 
University of Otago, New Zealand 


Attention has already been drawn to the discrepancy which may exist
between what the pedestrian believes his visibility to be and what it
actually is.¹ It has been pointed out that probably many pedestrians
overestimate their visibility and this has been suggested as a cause of
road accidents.

This article describes the collection of further data and gives an 
analysis and discussion of them. 


Subjects and Method 


The subjects were sixty-nine volunteers from the first year Psychology
Class at the University. The general method was to get the subjects to
walk along a straight road, either away from, or towards, the headlights
of a stationary automobile. They had to indicate, by depositing small
pegs, those points along the road at which they held certain beliefs about
their visibility. In the previous investigation the subjects were asked to
indicate, for example, when they believed with absolute certainty that
they were just beyond the range of the driver’s visibility. But visibility
is a matter of degrees. Consequently, the phrase “beyond the range of
the driver’s visibility” is capable of different interpretations by different
subjects. In this investigation, therefore, an attempt was made to
standardize the interpretation of “visibility.” By “visible” was to be
understood, “just visible as a pedestrian.” Further, a pictorial
demonstration of the meaning of “just visible as a pedestrian” was given.
That was done in the following way. A picture of a pedestrian was projected
on a screen (by means of an epidiascope) and a piece of differentially
smoked glass introduced into the projected beam. In this way a
demonstration of varying degrees of visibility was given and the meaning of
“just visible as a pedestrian” made clear. All this implied a departure


* My thanks are due to Mr. W. B. B. McDowell and Inspector W. P. Gibson of the 
New Zealand Government Transport Department, for supplying various facilities, 
including the automobiles, for carrying out this investigation, and to Mr. C. F. Wrigley, 
who so willingly and ably assisted me each evening. 

¹ H. H. Ferguson and W. R. Geddes, J. occupa. Psychol., 1940, 14.




from the previous procedure of keeping each of the subjects in ignorance 
of what he had to do until he was asked to make his observations. On 
this occasion each subject was supplied with a copy of the following: 


University of Otago, Psychological Laboratory 


An Investigation of Pedestrians’ Beliefs Regarding Their 
Visibility to Automobile Drivers 


The various subjects taking part in this investigation are required to walk,
either away from, or towards, the headlights of a stationary automobile. They
have to indicate (by the placing of pegs) the points along the road at which
they hold certain beliefs regarding their visibility to the “driver.” It is, of
course, understood that the “driver” has normal vision.

Visibility is a matter of degree, ranging from the absolutely invisible to the
clearly visible. For the purposes of this investigation the following scale of
visibility is to be used:


absolutely    visible only    just visible as    easily distinguishable    very clearly visible
invisible     as an object    a pedestrian       as a pedestrian           as a pedestrian


Subjects’ estimations are to be given in terms of “just visible as a pedestrian.” 


Now here are the instructions: 


Part A
Proceed along the road with your back to the automobile.


When you have reached the last point at which you believe with absolute
certainty that you are just visible as a pedestrian take one of your pegs and
place it on the channelling. When depositing the peg be careful not to look
back at the headlights.


Then continue along the road until you reach a point at which you believe that
there is a 50-50 chance that you are just visible as a pedestrian. Take another of
your pegs and place it on the channelling. Again be careful not to look back
at the headlights.


Then proceed further along the road until you reach the first point at which
you believe with absolute certainty that you are no longer just visible as a
pedestrian. Now deposit your last peg. Again do not look back at the
headlights. Walk on towards the Art Gallery.


Part B 
Proceed along the road towards the automobile. 


When you have reached the last point at which you believe with absolute
certainty that you are beyond the range of being just visible as a pedestrian, place 
a peg on the channelling. 


Then continue along the road until you reach a point at which you believe 
that there is a 50-50 chance that you are just visible as a pedestrian. At this point 
deposit your second peg on the channelling. 


Then continue to the first point at which you believe with absolute certainty 
that you are just visible as a pedestrian. Deposit your last peg and proceed to 
the automobile. 







All subjects will be asked to carry out the instructions in both Part A and
Part B, but 50 per cent of the subjects will be asked to carry out the
instructions in Part B first.


Make a note of your introspective observations and commit them to paper 
at the earliest possible moment. Also make a note of any comments you wish
to make. 


Part A of these instructions was repeated verbally to each subject 
immediately before he attempted to carry out the instructions contained 
in Part A. Similarly with Part B. 

Readers specially interested in other changes in the instructions 
should consult their earlier form.²


Special Conditions of this Investigation 


This investigation was carried out on a straight, level, unsealed,
“blind-alley,” unlighted road, 1330 feet in length and 20 feet wide. The
automobile was drawn up on the left hand side of the road with headlights
facing towards the terminus of this road but 1198 feet from it. On the
left-hand side of the road, facing in the direction of the automobile, was a
strip of light grey cement channelling 19 inches wide, and a row of dark
grey poplars (Populus monilifera) 12 feet from the channelling. These
poplars were 40 feet apart and 6 inches in diameter, 3 feet from the
ground. On the other side of the road and 9 feet 6 inches from it ran a
row of middle grey electric power poles, 160 feet apart. The first of these
was 66 feet ahead of the automobile headlights. Behind the automobile
was a road with 100 C.P. series street lamps. The cement channelling,
the power poles and the poplars all reflected a readily noticeable amount
of the light from the headlights.³

Observations were made on five successive evenings. Weather condi- 
tions during each evening remained very constant and the difference from 
evening to evening was small. August 5th: No moon, slight frost haze, 
starlit; 6th: Ditto; 7th: No stars, obscured moon; 8th: Few stars, moon 
sufficient to pick up second-hand on pocket watch; 9th: Ditto. 

The headlights appeared to remain very constant during each evening 
as judged by readings on a foot-candle meter. On Monday and Tuesday 
a Ford V8 (1934) was used and on the other evenings a Ford V8 (1936). 

The greatest distances at which a typically attired, moving pedestrian
was “just visible as a pedestrian” were approximately 206, 217, 268,
243 and 246 feet on the five successive evenings.

These figures are not precise; they are almost certainly too large. 
They are based on observations taken by the author, whose visual acuity, 

2 Ibid., pp. 199-200. 


3 The road used in the previous investigation was 12 miles from the University and 
difficult of access. Hence this change. 










corrected, was 6/6 and whose night vision, on the basis of W. D. Wright’s 
Test for night vision,⁴ was only slightly below average. Also, the automobile 
was stationary, the windscreen well wiped and the observer, who 
had not been exposed to any recent glare, was definitely on the look out 
for pedestrians. 


Data Obtained 


Subjects’ Estimations. Table 1 gives the Mean and the Range of the 
estimations obtained on each evening. Estimations for subjects starting 
with the outward journey are kept separate from those starting with the 
inward journey. All readings are given in feet. 


Table 1 

                                       Estimations 
Number 
of        Actual            Outward                       Inward 
Subjects  Visibility    1        2        3         3        2        1 

7*        206         182.6    267.7    309.6     387.4    496.7   [illeg.] 
                     146-218  203-319  246-481   257-503  341-638 

8         206         256.1    347.9    476.8     401.0    542.0   [illeg.] 
                     132-431  208-550  295-753   158-593  335-712 

7*        217         144.9    264.1    370.7     257.7    398.7   [illeg.] 
                      47-348  128-438  219-539   109-477  241-658 

7         217         198.6    329.1    434.0     421.1    601.1   [illeg.] 
                     127-267  225-420  290-600   266-667  398-781 

6*        268         173.3    263.8    414.0     411.8    627.8   [illeg.] 
                     107-249  174-376  298-548   320-481  493-762 

7         268         292.0    424.8    629.7     442.7    630.1   [illeg.] 
                     135-481  204-650  465-866   344-576  512-761 

6*        243         168.3    234.7    386.0     294.5    432.3   [illeg.] 
                     101-219  121-305  292-476   107-477  292-545 

8         243         222.0    351.4    561.6     500.8    688.6   [illeg.] 
                     104-564  212-744  335-937   347-678  450-804 

7*        246         167.9    288.1    478.3     372.7    561.3   [illeg.] 
                      88-264  188-388  267-666   117-621  223-753 

6         246         230.7    347.8    481.5     375.0    558.3   [illeg.] 
                      90-345  229-448  441-601   218-519  417-676 

* Subjects starting with the outward journey. 


4 Pitman, 1941. 










Table 2 shows how the subjects’ estimations scatter on either side of 
the last points at which the typically attired pedestrian could just be seen 
as a pedestrian. It consists of six frequency distributions with class- 
intervals of 20 feet. 

Introspective Notes and Comments. The subjects responded very well 
to the request for introspective notes and comments. These proved to be 
full of interest. 


Discussion 


The data now presented amply substantiate the previous finding, viz., 
that many pedestrians tend to overestimate the extent to which they are 
visible to the drivers of automobiles at night. 


Table 2 

[Table 2. Frequency of Estimations: six frequency distributions (Outwards 1, 2, 3; 
Inwards 3, 2, 1) in class-intervals of 20 feet relative to the standard, running 
from +780-799.9 feet down through +0-19.9 and -0-19.9 to -160-179.9 feet. The 
cell frequencies are illegible in this copy.] 


The subjects’ estimates show again a wide range of distribution. On 
the previous occasion we thought it probable that this wide range of dis- 
tribution was, in part, due to the fact that “visibility” was not defined in 
our instructions. The part played by such lack of definition would now 
in this respect appear to be small. 

A discrepancy of similar magnitude is again found between “outward” 
and “inward” estimates. When estimating the last (or first) point at 
which they were absolutely certain that they were just visible as pedes- 
trians, 29 per cent gave readings greater than the standard when moving 
from, and 87 per cent when moving towards the headlights. 

It will be noted from Table 1 that, with the exception of the 1st and 
2nd mean estimations on the inward journey on the Friday evening, 
all these mean estimations are smaller for subjects starting with the 
outward journey. This general difference is probably due, in part, to the 
influence, more or less witting, of the general tendency to give larger 
estimations on the inward journey. Of the 207 estimations for the inward 
journey only 3 were less than the corresponding estimations for the out- 
ward journey and that only by a total of 7 feet. (Two subjects were 
responsible for these readings.) 

From a theoretical point of view the tendency to underestimate visi- 
bility is just as interesting as the tendency to overestimate visibility. 
From a practical point of view, too, an understanding of how some sub- 
jects tend to underestimate their visibility may also be important. 

The introspective notes are important. They indicate the presence 
of various untenable beliefs, which are in part responsible for the tendency 
to overestimate visibility. For example: “If I appear myself to be clearly 
illumined, I am seen.” This particular belief, probably held more frequently 
in an implicit form, would seem to be important in connection 
with the wearing of light-coloured or white clothing. Without something 
like experimental evidence regarding the actual distance at which persons 
so attired can be seen, it is possible that such clothing may produce 
grossly exaggerated beliefs regarding visibility. That may be dangerous 
under certain conditions of illumination. Without fairly exact knowledge 
regarding one’s visibility, the wearing of white may be a death-trap. It 
would be interesting to know how subjects, clad in white, would react. 

The glare or dazzle effect of headlights would seem to be guilty not 
only of annoying and “blinding” road users, but of deluding them. We 
have observed how pedestrians may believe they are seen because they 
are dazzled. Yet it is possible that the dazzle effect might be quite useful 
provided automobiles had a high degree of uniformity in their headlights 
and pedestrians and other road users ascertained by actual experiment 
the degrees of glare effect necessary for visibility. 

The difficulty experienced by the subjects in making use of their 
observations as drivers or occupants of automobiles, is also in part due to 
lack of uniformity in automobile lighting arrangements. The following 
note by A. J. T. may be quoted by way of illustrating the problem: “Walking 
towards the headlights, I felt that the placings I had made on the way 
out were underestimated by a large amount and consideration of the 
brightness of the lights made me deposit my pegs much further from the 
car.” A. J. T.’s readings, relative to the standard, were -30, +48, +134 
feet on the outward journey, and +582, +496, +375 feet on the inward 
journey. 
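
The size of A. J. T.'s shift between the two journeys can be worked out from the six readings just quoted. The following is only an illustrative sketch: it averages the three readings for each journey, which is a convenience chosen here and not a statistic reported by the author.

```python
from statistics import mean

# A. J. T.'s peg placements in feet relative to the standard, as quoted in the text.
outward = [-30, 48, 134]
inward = [582, 496, 375]

# Averaging the three readings per journey is an illustrative choice,
# not a figure given in the paper.
mean_outward = mean(outward)
mean_inward = mean(inward)
shift = mean_inward - mean_outward

print(f"mean outward deviation: {mean_outward:+.1f} ft")
print(f"mean inward deviation:  {mean_inward:+.1f} ft")
print(f"inward minus outward:   {shift:+.1f} ft")
```

On these figures the inward placings lie, on average, several hundred feet beyond the outward ones, which is the discrepancy the subject himself describes.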

The introspective notes remind us that certainty of belief is no guar- 
antee of the accuracy of the belief. 







It seems clear that the most important practical thing to do imme- 
diately is to make the findings of this and the previous investigation 
widely known, not only by telling people about it but by encouraging 
them to carry out this experiment for themselves. 


Questions Yet to be Answered 


To what extent does age influence this delusion? To what extent do 
the form, position, quality and intensity of the headlights, or any other 
possible illumination of the vehicle, influence the delusion? What is the 
effect of moving headlights? To what extent do differing atmospheric 
conditions, for example, rain and fog, street lighting, road surface, etc., 
influence the delusion? To what extent does the delusion exist under 
ordinary street-traffic conditions? What is the effect of wearing light 
clothing? How persistent is this delusion? What would be the effect of 
trying out pedestrians and of giving them knowledge of how they erred? 
And, looking at the question from the driver’s point of view, to what 
extent may he be deluded regarding the distance at which he can perceive 
objects? 

Summary 


The tendency of pedestrians to overestimate the degree to which they 
are visible to automobile drivers at night was investigated. There were 
sixty-nine subjects, University students. The general method was to 
get the subjects to walk either away from, or towards the headlights of a 
stationary automobile, and to indicate, by the depositing of pegs, the 
points along the road at which they held certain beliefs about their 
visibility to the “driver.” When moving away from the headlights the 
subjects were asked to indicate, for example, the last point at which they 
were absolutely certain that they were just visible as a pedestrian. The 
data thus obtained indicate that an appreciable percentage of pedestrians 
overestimate their visibility to what may reasonably be held to be a 
dangerous degree, while introspective notes indicate the presence of 
various untenable beliefs which help to account for this tendency. It is 
suggested that the immediate need, from a practical point of view, is to 
“educate” road users regarding this source of error. A possibly dangerous 
effect of the wearing of white clothing is mentioned; a greater degree of 
uniformity in automobile lighting arrangements is recommended, and sug- 
gestions for future enquiry are made. 


Received March 11, 1943. 





Legibility of Newspaper Headlines Printed in Capitals 
and in Lower Case * 


Keller Breland and Marian Kruse Breland 
University of Minnesota 


Previous investigations by Paterson and Tinker¹ have demonstrated 
that paragraphs printed in capitals are considerably less legible than 
those printed in lower case, legibility being measured by total amount 
read in a given time. The difference in their experiment was 11.8 per 
cent in favor of the lower case. In other words, there is almost 12 per 
cent loss when the text is set up in all capitals. 

In correspondence with Mr. Theodore Bernstein of the New York 
Times the following questions arose: Would this difference remain when 
the task presented to the subject was to read a newspaper headline? 
Might not the capitals be more legible when the subject was asked to 
read a short block of type at a glance? Would it be necessary to double 
the size of type in which lower case headlines were printed in order to 
secure legibility equal to or greater than that of the capitals? It was 
these questions which the present experiment was designed to answer: 
in general, to compare the legibility of newspaper headlines printed com- 
pletely in capitals with the legibility of those printed in capitals and in 
lower case where the initial letters of the important words only are 
capitalized.² 

For purposes of the present study, legibility is defined in terms of the 
number of words read during a brief exposure time and correctly reported 
by the subject immediately after the exposure. 

Since the total problem was obviously a complex one, it was decided 
to concentrate for the present study on headlines of uniform length and 
type-face. The length to be used was determined by a count of 914 
single-column headlines in the New York Times: mean = 4.64; 961 head- 
lines in four local papers: mean = 5.49; all 1875 headlines: mean = 5.25. 


* We would like to express our appreciation to Dr. M. A. Tinker and Prof. D. G. 
Paterson for their advice and assistance in the research and in the preparation of this 
paper. 

1 Paterson, D. G., and Tinker, M. A. How to make type readable. New York and 
London: Harper and Bros., 1940, pp. 22-26. (Obtainable from the authors.) 

2 All words except articles, prepositions, conjunctions, etc., unless one of these is the 
initial word in the headline. 








The material used in this investigation, therefore, consisted of 120 five- 
word newspaper headlines. The texts were taken from December, 1940, 
and January, 1941, issues of the New York Times. The single-column 
headlines were printed in 24 pt. Cheltenham bold-face, extra-condensed 
type in two lines on newsprint. They were then pasted on 544” x 33’ 
tachistoscope cards which had first been covered with newsprint paper 
stock. 

Each of the 120 headlines was printed twice, once in capitals and 
once in lower case. As they came from the printer, the headlines in 
capitals were numbered in order of printing and randomized from a table 
of random numbers. Since those in lower case were identical, they were 
given the same numbers and the same random order. Two sets were 
then organized in such a way that each set contained all the headlines: 
The first, third, fifth, etc., of Set Aa were in capitals; the second, fourth, 
sixth, etc., were in lower case. The first, third, fifth, etc., of Set aA were 
in lower case; and the second, fourth, sixth, etc., were in capitals. For 
example, 

Headline 
No.        Set Aa                 Set aA 
1          PARAGUAY OBJECTS       Paraguay Objects 
           TO BOLIVIAN PLAN       to Bolivian Plan 

2          Cocker Spaniels        COCKER SPANIELS 
           Lead Breed List        LEAD BREED LIST 


Subjects 1, 3, 5, etc., were given Set Aa and Subjects 2, 4, 6, etc., 
were given Set aA. In other words, each subject read all the headlines: 
for the first subject, one half was in capitals and the other half in lower 
case. For the second subject, this order was reversed. 
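
The alternation described above can be sketched in a few lines. This is a minimal illustration only: the function names `build_sets` and `set_for_subject` are hypothetical, and the two sample headlines stand in for the full randomized list of 120.

```python
def build_sets(headlines):
    """Return (set_Aa, set_aA); both hold every headline in the same order."""
    set_Aa, set_aA = [], []
    for position, text in enumerate(headlines, start=1):
        caps, lower = text.upper(), text
        if position % 2 == 1:
            # Odd positions: capitals in Set Aa, lower case in Set aA.
            set_Aa.append((position, caps))
            set_aA.append((position, lower))
        else:
            # Even positions: the assignment is reversed.
            set_Aa.append((position, lower))
            set_aA.append((position, caps))
    return set_Aa, set_aA


def set_for_subject(subject_number, set_Aa, set_aA):
    """Odd-numbered subjects receive Set Aa, even-numbered subjects Set aA."""
    return set_Aa if subject_number % 2 == 1 else set_aA


# Two sample headlines, already in their randomized order.
headlines = [
    "Paraguay Objects to Bolivian Plan",
    "Cocker Spaniels Lead Breed List",
]
set_Aa, set_aA = build_sets(headlines)
for number, text in set_for_subject(1, set_Aa, set_aA):
    print(number, text)
```

Because each headline appears in capitals in one set and in lower case in the other, every headline contributes equally to both conditions once the two groups of subjects are combined.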

The above experimental design was used in order to keep the results 
free from practice and fatigue effects, differences in difficulty of the head- 
lines, and differences in ability of the subjects. 

Twenty-two senior and graduate students at the University of Minne- 
sota served as subjects. The apparatus used was the Dodge Mirror 
Tachistoscope.³ The subjects were instructed to write down on the 
forms provided as much of the headline as they were able to remember. 
If they were not sure of some words, they were told to guess. Following 
the reading of the instructions, each headline was exposed for 50 milli- 
seconds (1/20th of a second) in the tachistoscope, and the subject then 
wrote on the blank as many words as he could. The first twenty headlines 
were a practice series. After 60 headlines (which included the 
practice series), there was a brief pause in which the subjects could relax 
and rest their eyes. 

3 Dodge, Raymond, An improved exposure apparatus, Psychol. Bull., 1907, 4, 10-13. 
Also see Tinker, M. A., A noiseless exposure apparatus, Amer. J. Psychol., 1931, 43, 
640-642. 

The responses were scored by recording the number of words correctly 
reported for each headline, regardless of the order in which they were 
reported. In order to avoid arbitrary decisions about correctness of 
reporting, misspelled words were counted as wrong. Two scores were 
then computed for each subject,—the sum of scores on 50 lower case 
headlines and the sum of scores on 50 all capitals headlines. 
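
The scoring rule just described (words credited regardless of order, misspellings counted as wrong) might be expressed as follows. This is a sketch under stated assumptions: the function name `score_headline` is hypothetical, and the comparison is made case-insensitive here because the same headline was shown in capitals to some subjects and in lower case to others; the paper does not say how letter case in the written responses was treated.

```python
from collections import Counter


def score_headline(headline, response):
    """Count headline words reproduced exactly, ignoring word order."""
    # Assumption: letter case is ignored; only the spelling must match.
    target = Counter(headline.lower().split())
    reported = Counter(response.lower().split())
    # Counter intersection keeps the smaller count of each shared word,
    # so a word repeated in the response is not credited twice.
    return sum((target & reported).values())


headline = "PARAGUAY OBJECTS TO BOLIVIAN PLAN"
print(score_headline(headline, "Paraguay plan objects"))   # order is ignored
print(score_headline(headline, "Paraguy objects"))         # misspelling earns no credit
```

A subject's two totals would then be the sums of such scores over the 50 lower case and the 50 all-capitals test headlines.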

The results of the experiment were as follows: the mean lower case 
score was 116.32; the mean all capitals score was 94.32. The mean 
difference was 22.00. In twenty cases, the difference was in favor of the 
lower case; there were two cases of reversal (the differences in these cases 
were —3 and —11). Scores on lower case and scores on capitals corre- 
lated to the extent of +.90. 

The distribution of differences was analyzed by calculating Student’s 
t, which in this case is equal to 6.33. When the significance of the differ- 
ence between the means is analyzed according to this test, it is found 
that less than once in 10,000 times would a difference equal to or greater 
than this be obtained solely through errors of random sampling, i.e., the 
difference is highly significant. 

The percentage difference was calculated as the ratio of the difference 
to the mean of the lower case scores. This difference is 18.9 per cent. 
In other words, there is 18.9 per cent loss in reading headlines set in all 
capitals in comparison with lower case. 
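
The arithmetic behind these figures can be retraced from the reported means and t. The sketch below assumes that t was obtained as the mean difference divided by its standard error, and that the standard error equals the standard deviation of the differences over the square root of N; the paper does not state which computing formula was used, so the last two quantities are implied values rather than reported ones.

```python
import math

# Figures reported in the text.
mean_lower = 116.32
mean_caps = 94.32
t_value = 6.33
n_subjects = 22

mean_diff = mean_lower - mean_caps
percent_loss = 100 * mean_diff / mean_lower

# Assumption: t = mean difference / standard error of the mean difference.
se_diff = mean_diff / t_value
# Assumption: standard error = SD of the differences / sqrt(N).
sd_diff = se_diff * math.sqrt(n_subjects)

print(f"mean difference: {mean_diff:.2f} words")
print(f"percentage loss: {percent_loss:.1f} per cent")
print(f"implied standard error of the difference: {se_diff:.2f}")
print(f"implied SD of the individual differences: {sd_diff:.1f}")
```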

The data were further analyzed to determine if there were any differ- 
ences between scores on the two orders of presentation. It was found 
that the subjects tested on Set Aa reported more words correctly from 
headlines both in lower case and in all capitals (the mean performances 
are 116.36 and 95.82 respectively in the Aa group, and 116.27 and 92.82 
in the aA group). In neither case are these differences significant (the 
respective probabilities are .99 and .82). Differences this large might 
easily arise through sampling errors.⁴ 

These results indicate that for the length and type face here con- 
sidered, headlines printed in lower case are considerably more legible than 
those printed entirely in capitals, when legibility is defined as the number 
of words which can be correctly reported after a “glance” at the headline. 
The difference is large, statistically highly significant, and based on scores 


4 At least two factors, perceptual span of the two groups of subjects, and compara- 
tive difficulty of the headlines printed in lower case and in capitals in the two sets, are 
involved here, but since the differences are not significant, there is little point to further 
analysis. 










which are presumably quite stable since they represent a composite of 
the scores on 50 similar items. Furthermore, the percentage difference 
is larger than that previously obtained in the reading test comparison 
by Tinker and Paterson.⁵ 

In view of these results, it seems incontrovertible that a real difference 
exists in favor of single-column headlines printed in lower case versus all 
capitals. Moreover, it does not seem necessary to compare 24 pt. capi- 
tals with 36 or 48 pt. lower case, since 24 pt. lower case headlines are 
obviously markedly superior in legibility to headlines in the same size 
capitals. 


Received February 16, 1943. 


5 Op. cit., p. 23. 



















A Comparison of Norms for the Minnesota Rate of 
Manipulation Test 


Jacob Tuckman 
Jewish Vocational Service, Cleveland, Ohio 


The Minnesota Rate of Manipulation Test¹ is used widely in industry 
to select workers for a variety of semi-skilled jobs where speed in the 
handling of materials or tools is important. It is a modification of the 
original Minnesota Manual Dexterity Test developed by W. A. Ziegler 
for the Minnesota Employment Stabilization Research Institute, to 
determine the length of time required for the movement of the pieces in 
the Minnesota Spatial Relations Test. The apparatus is a wood board 
containing 60 cylindrical holes, arranged in four rows of 15, into which 
60 slightly smaller blocks can be placed. The original board consisted of 
58 holes. The test has two parts: a Placing test, in which the subject, 
using one hand, places the blocks into the holes in a definite order, from 
a fixed position; and a Turning test, in which the subject picks up a block 
with the left hand, turns it over, and replaces it in the same hole with the 
right hand, alternating hands for each subsequent row. The Placing test 
is designed to measure speed of hand manipulation; the Turning test, 
speed of finger manipulation. Five trials are given compared with four 
in the original test,—one practice and four test trials. The score is the 
number of seconds required to complete the last four trials. 

There is little information regarding the validity of this test. Packers 
and wrappers have been found to be considerably above the average of 
workers in general in speed of hand and finger manipulation, but these 
findings apply to the original board. In the manual accompanying the 
test, no description of the population on which the test was standardized 
is given. No reliability is reported. A correlation of .57 between hand 
and finger speed is reported for 500 subjects, indicating that Placing and 
Turning are measuring rather different traits. The norms are considered 
as suitable for use beginning with 15 year olds. Distributions for both 
men and women were found to be practically identical. 

This study will not concern itself with the validity or reliability of the 
Minnesota Rate of Manipulation Test, but will be limited to an examina- 
tion of the suitability of the norms published by the Educational Test 
Bureau. Inspection of the test scores of individual cases indicated that 


1 Distributed by Educational Test Bureau, Minneapolis, Minn. 













subjects tended to do better on Turning than on Placing. It was decided 
to investigate this problem more intensively after a preliminary study of 
the test records of 100 cases showed that 80 per cent scored higher on 
Turning. 

Test scores for both Placing and Turning were available for 1117 
individuals (98 per cent were Jewish) referred by the Placement and 
Vocational Counseling Departments of the Cleveland Jewish Vocational 
Service. For more diagnostic analysis, these subjects were divided into 
four groups: 195 high school boys (grades 9-12), 170 high school girls, 407 
men, and 345 women. The boys ranged in age from 13.3 to 19.3 years; 
the girls, from 13.9 to 18.5 years. The mean ages were 15.89 and 15.75 
respectively. The median school grade was 10B (the first half of the 
tenth year) for both boys and girls. Both school groups were superior 
in intelligence on the basis of the American Council on Education High 
School Examination (1938, 1939, and 1940 editions) and the Terman 
Group Test of Mental Ability. The percentile range for boys was from 
4 to above 99; for girls, from 1 to above 99. The median percentile ranks 
were 76 and 75, respectively. 

The adult group was comprised almost entirely of individuals seeking 
employment, with a small number interested in training or upgrading 
on the job. Approximately 90 per cent of the adult group were individ- 
uals in need of short-term counseling or were difficult to classify occupa- 
tionally. Many had had little or no work experience and needed further 
occupational exploration in terms of their interests and abilities. Of 
those individuals with a previous work history, the large majority of both 
men and women had been employed as sales clerks, stock and shipping 
clerks, messengers, and routine clerical workers. The remaining 10 per 
cent of the adult group, of which more than half were refugees, and of 
which a very small proportion were students attending college, were in 
need of long-term counseling because of the necessity for training or 
retraining. 

The men ranged in age from 16 to 58 years; the women, from 17 
to 46 years; with mean ages of 21.9 for both groups. Forty-eight 
per cent of the men and 52 per cent of the women were high school gradu- 
ates. Many had completed some college work. The percentile rank for 
intelligence ranged from 2 to 97 for men, from 1 to 95 for women, with 
median percentile ranks of 74 and 52, respectively, on the basis of the 
Pressey Senior Classification Test and the Otis Higher Examination. 

Norms for Placing and Turning, which were developed at the Cleve- 
land Jewish Vocational Service for high school students, men, and women, 
are presented in Table 1. The distribution of the test scores and the 
changing age base warranted the setting up of separate local norms. 







Table 1 

Cleveland Jewish Vocational Service Norms for Placing and Turning for High School 
Students, Men, and Women 

                      Placing                          Turning 
                 (Time in Seconds)                (Time in Seconds) 

            High School                      High School 
Per-        Students    Men      Women       Students    Men      Women 
centile     N = 365     N = 407  N = 345     N = 365     N = 407  N = 345 

99          190.0       174.6    180.0       143.0       131.6    147.0 
95          200.5       190.5    192.8       152.4       142.1    150.0 
90          208.4       199.4    201.0       157.7       149.6    153.7 
85          212.5       203.4    204.9       162.1       152.5    157.4 
80          216.0       207.3    208.8       164.3       155.3    161.1 
75          219.5       210.5    212.2       166.6       158.2    163.6 
70          222.1       212.8    215.4       168.9       161.1    165.9 
65          224.8       215.1    218.7       171.1       163.4    168.2 
60          227.4       217.4    221.3       173.4       165.7    170.5 
55          230.0       219.7    223.7       175.6       168.0    172.9 
50          232.5       221.8    226.1       177.7       170.2    175.6 
45          235.1       223.9    228.5       179.9       172.5    178.4 
40          237.7       226.0    231.3       182.1       175.1    181.2 
35          239.7       228.0    234.3       184.2       178.0    184.1 
30          243.5       230.6    237.3       187.3       180.9    187.3 
25          246.6       234.6    240.6       191.6       183.2    190.8 
20          249.8       238.5    244.9       [illeg.]    187.5    194.4 
15          254.9       244.0    248.9       [illeg.]    192.3    198.7 
10          260.3       250.5    255.6       [illeg.]    197.1    207.7 
 5          270.7       259.4    266.0       [illeg.]    205.8    220.6 
 1          282.9       289.2    292.3       [illeg.]    244.7    320.8 





The mean score (time in seconds), standard deviation, percentile rank 
equivalent of the mean on the basis of the Educational Test Bureau 
norms for both Placing and Turning for boys, girls, men, and women, and 
for all four groups combined are presented in Table 2. The significance 
of the difference between the means of the four groups is given in Table 3. 

In comparing the performance of the Cleveland groups with the norms 
of the Educational Test Bureau, certain differences are evident. For 
Placing, the performance of both boys and girls is almost identical with 
the norms of the Educational Test Bureau, but that of men and women is 
considerably faster. The difference between the means for boys and men, 
and for boys and women is statistically reliable. There is a significant 
difference between girls and men, but not between girls and women. 
However, in comparing the mean score of boys and girls combined with 
that of women, the difference is significant. Men tend to be faster than 
women, but the difference is not reliable. For Turning, the performance 
of all groups is considerably more rapid than the norms of the Educational 
Test Bureau. Adult men are significantly faster than any of the other 
groups. Women show the greatest variability and tend to be faster than 
the combined school group, but the difference is very small and is not 
statistically reliable. 


Table 2 

Mean Score (Time in Seconds), Standard Deviation, Equivalent Percentile Rank of the 
Mean for Placing and Turning for High School Boys, High School Girls, Men, 
Women, and for All Groups Combined, 1117 Subjects 

                                       Placing 
                                             Percentile 
Group                      N        Mean     Rank        σ dist. 

High School Boys           195      233.4    49          19.96 
High School Girls          170      233.4    49          21.47 
Total School Group         365      233.4    49          20.67 
Men                        407      223.2    66          20.92 
Women                      345      227.6    59          23.33 
All Groups Combined       1117      227.9    58          22.02 

[Turning columns illegible.] 


Table 3 
Comparison of the Mean Score for All Groups for Placing and Turning, 1117 Subjects 

                                        Placing                     Turning 
Groups Compared                  D      σ diff.  D/σ diff.    D         σ diff.  D/σ diff. 

High School Boys and Men         10.27  1.7642   5.82         [illeg.]  1.9323   [illeg.] 
High School Boys and Women        5.85  1.9012   3.08         [illeg.]  2.2054   [illeg.] 
High School Girls and Men        10.29  1.9460   [illeg.]     [illeg.]  1.8611   [illeg.] 
High School Girls and Women       5.87  2.0711   [illeg.]     [illeg.]  2.1434   [illeg.] 
Total School Group and Men       10.28  1.4986   [illeg.]     [illeg.]  1.5287   [illeg.] 
Total School Group and Women      5.86  1.6578   [illeg.]     [illeg.]  1.8620   [illeg.] 
Men and Women                     4.42  1.6289   [illeg.]     [illeg.]  1.7915   [illeg.] 
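
The reliability statements for Placing can be retraced from the differences (D) and standard errors (σ diff.) given in Table 3 by forming the ratio D/σ diff. for each comparison. The sketch below is illustrative: the cut-off of 3 is an assumption (a common convention for critical ratios at the time), not a criterion stated in the paper, although it reproduces every verdict given in the text for Placing.

```python
# Placing: (D, sigma_diff) for each comparison, as printed in Table 3.
placing = {
    "High School Boys and Men": (10.27, 1.7642),
    "High School Boys and Women": (5.85, 1.9012),
    "High School Girls and Men": (10.29, 1.9460),
    "High School Girls and Women": (5.87, 2.0711),
    "Total School Group and Men": (10.28, 1.4986),
    "Total School Group and Women": (5.86, 1.6578),
    "Men and Women": (4.42, 1.6289),
}

# Assumption: a ratio of 3 or more is treated as statistically reliable.
CRITICAL_RATIO = 3.0

ratios = {}
for pair, (d, sigma_diff) in placing.items():
    ratios[pair] = d / sigma_diff
    verdict = "reliable" if ratios[pair] >= CRITICAL_RATIO else "not reliable"
    print(f"{pair:30s} D/sigma = {ratios[pair]:.2f}  ({verdict})")
```

The first two ratios agree with the two values that remain legible in the table (5.82 and 3.08).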

The more rapid performance of men and women as compared with the 
norms of the Educational Test Bureau may possibly be explained by the 
fact that almost all were seeking employment and apparently were anxious 
to do well. Age undoubtedly was a selective factor. However, this 
would not explain the difference in performance between Placing and 
Turning. The difference in performance between the school and adult 
population may be due to the fact that the former has not reached speed 
maturity. The sex difference among adults may be attributed to the 
higher intelligence of the men and a greater interest on their part in 
factory work. 







For both Placing and Turning, the distributions of all groups show a 
tendency for the scores to cluster toward the upper end of the scale 
(Table 4). For Placing, the distributions of all groups show no significant 
divergence from the theoretical normal curve. For Turning, all 
distributions show a greater divergence from normality. For girls and 
men the skewness is not significant; for women, and for boys and girls 
combined the skewness is significant. The correlations between Placing 
and Turning for all groups (Table 5) are higher than that of .57, obtained 
by Ziegler in a group of 500 subjects, but none of the differences is 
statistically significant. In comparing the correlation obtained for all 
four groups combined, the $D/P.E._{diff.}$ is 5.67 and is statistically reliable. 


Table 4 
Skewness * of the Distributions of All Groups for Placing and Turning, 1117 Subjects 

                                    Placing                      Turning 
Group                  N      SK      σ sk    SK/σ sk     SK      σ sk      SK/σ sk 

High School Boys       195    -2.44   1.89    -1.29       -6.55   2.06      [illeg.] 
High School Girls      170    -1.19   2.14    -.56        -5.37   1.92      [illeg.] 
Total School Group     365    -1.84   1.41    -1.30       -6.01   [illeg.]  [illeg.] 
Men                    407    -3.14   1.31    -2.40       -3.13   1.22      [illeg.] 
Women                  345    -2.17   1.52    -1.43       -5.13   [illeg.]  [illeg.] 

* $SK = P_{50} - \frac{P_{90} + P_{10}}{2}$ 


Table 5 
Correlation Between Placing and Turning for All Groups, 1117 Subjects 

Group                      r          P.E. r 

High School Boys           [illeg.]   .03 
High School Girls          [illeg.]   .03 
Men                        .66        .02 
Women                      .60        .02 
All Groups Combined        [illeg.]   .01 
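
The Placing skewness figures can be approximately retraced from the percentile norms of Table 1. The sketch below rests on three assumptions that are not stated in the text: that skewness was taken as the median minus the mean of the 90th and 10th percentiles, that the third and nineteenth rows of Table 1 are the 90th and 10th percentiles, and that the standard error was computed as .5185 times the distance between those percentiles divided by the square root of N. Under these assumptions the results fall within rounding of the printed values (SK of -1.84, -3.14, -2.17 and σ sk of 1.41, 1.31, 1.52 for the school group, men, and women).

```python
import math

# (P90, P50, P10, N) for Placing, read from Table 1 (time in seconds).
# Assumption: rows three, eleven and nineteen are percentiles 90, 50 and 10.
placing_norms = {
    "Total School Group": (208.4, 232.5, 260.3, 365),
    "Men": (199.4, 221.8, 250.5, 407),
    "Women": (201.0, 226.1, 255.6, 345),
}


def skewness(p90, p50, p10):
    """Assumed form: median minus the midpoint of the 90th and 10th percentiles."""
    return p50 - (p90 + p10) / 2


def sigma_skewness(p90, p10, n):
    """Assumed standard error: .5185 * (distance between P90 and P10) / sqrt(N)."""
    return 0.5185 * abs(p90 - p10) / math.sqrt(n)


results = {}
for group, (p90, p50, p10, n) in placing_norms.items():
    sk = skewness(p90, p50, p10)
    s = sigma_skewness(p90, p10, n)
    results[group] = (sk, s, sk / s)
    print(f"{group:20s} SK = {sk:6.2f}  sigma = {s:.2f}  ratio = {sk / s:.2f}")
```

The small discrepancies in SK (for example -3.15 against the printed -3.14 for men) would be expected if the original computation used unrounded percentile points.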


The tendency of subjects to perform more rapidly on Turning than 
on Placing is evident in other samplings. Through the courtesy of Dr. 
Henry S. Curtis,² data on Placing and Turning were made available for 


2 Department of Vocational Guidance and Placement of the Cuyahoga County 
Child Welfare Board. 







326 male and female subjects,—191 fourteen year olds, 63 fifteen year 
olds, and 72 sixteen year olds and over. This group differed from the 
Jewish Vocational Service school population in that the average intelli- 
gence was considerably lower. The average I.Q. was 92 for males and 
94 for females. The percentile rank equivalent of the mean for Placing 
and Turning, respectively, is 33 and 43 for fourteen year olds, 39 and 44 
for fifteen year olds, 54 and 60 for sixteen year olds and over. 

It would be interesting to compare our findings with those of Teegarden,³ 
who developed norms for Placing and Turning for the Cincinnati 
Employment Center, based on the performance of 500 males and 360 fe- 
males. Selective factors were in operation as compared with the Cleve- 
land group. The age range was restricted to the decade 16-25 years. 
Only those applicants were tested who indicated a willingness to be con- 
sidered for any type of job available. Applicants interested in clerical 
or sales jobs only were not included in the sample. The Cincinnati study 
showed a tendency for women to be faster than men for both Placing and 
Turning, although the differences were not significant; the Cleveland 
study showed men to be faster, not significantly for Placing, but significantly 
for Turning. This is in error.⁴ Women were significantly faster 
than men for Placing. The difference between the means for men and 
women for Placing is more than four times its standard deviation. In 
the Cincinnati study, men were slower for Placing as compared with the 
norms of the Educational Test Bureau, but more rapid for Turning; 
women were faster on both tests. The performance of both men and 
women was more rapid for Turning than for Placing, although the differ- 
ence is small for women and is in agreement with the Cleveland study. 
The performance of women in both studies was similar. In the Cincin- 
nati study, the mean score of women for Placing and Turning was 226.6, 
with a standard deviation of 20.65, and 182.0, with a standard deviation 
of 26.90, respectively; in the Cleveland study, the mean score was 227.6, 
with a standard deviation of 23.33, and 180.1, with a standard deviation 
of 27.36, respectively. Teegarden indicates that the slower performance 
of males as compared with the norms of the Educational Test Bureau may 
be due to the fact that the Placing test, with its loosely fitting blocks, was 
too easy to challenge their best efforts, and that many men were difficult 
to motivate because they thought that the colored blocks were children’s 
toys. The Turning test appeared to challenge their efforts more effec- 
tively because it presented a more difficult task. 

3 Teegarden, Lorene. Manipulative performance of young adults at a public
employment office. Parts I & II. J. appl. Psychol., 1942, 26, 633-652; 754-769.

4 J. appl. Psychol., 1942, 26, 759. (Editor’s Note: Errata published by L. Teegarden
corrects the error in J. appl. Psychol., 1943, 27, 206.) 







For males (Cincinnati) the skewness is significant for both tests; for 
men (Cleveland) the skewness is insignificant for both tests. For women, 
in both studies, the skewness is insignificant for Placing and significant for 
Turning. In both studies, the correlation between Placing and Turning 
is higher than that reported by the Educational Test Bureau. For 230 
males (Cincinnati) the correlation was .73; for 171 females the correlation 
was .65. In the Cleveland study, the correlations between Placing and 
Turning for 407 men and 345 women were .66 and .60, respectively. 
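The reliability of coefficients like these was conventionally expressed through the probable error of r, the quantity tabulated as P.E.r in Tables 6 and 7. The article does not state the formula it used, so the following Python sketch assumes the conventional one, 0.6745(1 − r²)/√N, and applies it to the four coefficients quoted above. The function name is illustrative only.

```python
import math


def probable_error_of_r(r: float, n: int) -> float:
    """Conventional probable error of a Pearson r.

    Assumed formula (not stated in the article): 0.6745 * (1 - r^2) / sqrt(N).
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# (group, r, N) exactly as quoted in the text above.
reported = [
    ("Cincinnati males", 0.73, 230),
    ("Cincinnati females", 0.65, 171),
    ("Cleveland men", 0.66, 407),
    ("Cleveland women", 0.60, 345),
]

for group, r, n in reported:
    pe = probable_error_of_r(r, n)
    print(f"{group}: r = {r:.2f}, P.E. = {pe:.3f}, r / P.E. = {r / pe:.1f}")
```

Under this assumption each coefficient is many times its probable error, so the Placing-Turning relationship is well established in both studies even though its size differs somewhat between them.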

Intelligence ratings on the basis of the Pressey Senior Classification 
Test were available for 302 women and 327 men, and on the American 
Council on Education High School Psychological Examination (1938,
1939, and 1940 editions) for 124 girls and 103 boys. For the school
group, all scores on the 1938 and 1939 editions were converted to equiva- 
lent 1940 scores. The correlations between intelligence test scores and 
scores on Placing and Turning for each of the four groups are presented in 
Table 6. For the combined school group and combined adult group there 


Table 6
Correlation Between Intelligence and Placing and Turning for All Groups, 856 Subjects

                              Intelligence Score     Intelligence Score      ACE or Pressey
                                 and Placing            and Turning              Score               Placing
Group                   N       r         P.E.r        r         P.E.r       Mean      σ dist.    Mean    σ dist.
High School Boys       103   [illeg.]   [illeg.]      .33      [illeg.]    [illeg.]     24.18     230.4    18.45
High School Girls      124   [illeg.]   [illeg.]    [illeg.]   [illeg.]    [illeg.]     24.72     229.9    18.33
Total School Group     227   [illeg.]   [illeg.]    [illeg.]   [illeg.]    [illeg.]     24.52     230.1    18.39
Men                    327   [illeg.]   [illeg.]      .22      [illeg.]    [illeg.]     17.76     223.5    20.48
Women                  302   [illeg.]   [illeg.]      .25      [illeg.]    [illeg.]     18.54     227.0    22.96
Total Adult Group      629   [illeg.]   [illeg.]    [illeg.]   [illeg.]    [illeg.]     19.93     225.2    21.71





is little difference in the correlation between intelligence and Placing and 
Turning. All the correlations are low. There is a closer relationship 
between intelligence and Turning as compared with intelligence and 
Placing for all groups. This is not surprising, since Turning is a more 
difficult task. However, it is fairly evident on the basis of the low correla- 
tions obtained that Placing and Turning are measuring non-intellectual 
functions. 

The correlation between score on Placing and Turning and age for all 
four groups is given in Table 7. For boys the correlation between age and
Placing is low. All other correlations indicate a negligible relationship. 
The correlation of .08 between age and Placing for both men and women, 
and the correlations of .01 and .07 between age and Turning for men and 
women, respectively, are probably due to the heterogeneity of the adult 







Table 7
Correlation Between Age and Placing and Turning for All Groups, 1117 Subjects

                                  Age and Placing        Age and Turning             Age
                                                                               Mean      σ dist.
Group                     N        r        P.E.r         r        P.E.r      (Years)    (Years)
High School Boys         195    [illeg.]  [illeg.]       .19     [illeg.]      15.89      1.36
High School Girls        170    [illeg.]  [illeg.]       .04     [illeg.]      15.75      1.23
Men                      407    [illeg.]  [illeg.]       .01     [illeg.]      21.93      4.49
Women                    345    [illeg.]  [illeg.]       .07     [illeg.]      21.87      4.47
All Groups Combined     1117    [illeg.]  [illeg.]       .08     [illeg.]      19.92      4.70





population. Speed of reaction attains a maximum somewhere between 
the 22nd and the 28th year. Insofar as younger and older subjects have 
been included in the population, the correlation will tend to reduce to zero 
unless the population is heavily saturated with individuals over thirty 
years of age. In the latter case, the correlation will tend to be negative. 
The breakdown for age for the adult population is presented in Table 8. 
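The effect of heterogeneity on the correlation can be illustrated numerically. The Python sketch below uses entirely hypothetical data: a speed index that peaks at age 25 (a value chosen arbitrarily within the 22 to 28 year range mentioned above) and falls off on either side. None of these figures come from the Cleveland sample. It compares a sample spread evenly on both sides of the peak with a sample made up only of subjects at or past the peak.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


PEAK_AGE = 25  # hypothetical age of maximum speed


def speed(age):
    """Hypothetical speed index: highest at PEAK_AGE, lower on both sides."""
    return -((age - PEAK_AGE) ** 2)


mixed_ages = list(range(17, 34))  # 17..33, symmetric about the peak
older_ages = list(range(25, 46))  # 25..45, all at or past the peak

r_mixed = pearson_r(mixed_ages, [speed(a) for a in mixed_ages])
r_older = pearson_r(older_ages, [speed(a) for a in older_ages])

print(f"younger and older subjects mixed: r = {r_mixed:.2f}")
print(f"older subjects only:              r = {r_older:.2f}")
```

With a curvilinear relation of this kind, the rising and falling limbs cancel in the mixed sample, while the sample weighted toward older subjects shows a clearly negative coefficient.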


Table 8
Mean Score by Age for Men and Women for Placing and Turning, 822 Subjects

                            Male                                          Female
                   Placing            Turning                    Placing             Turning
Age Group    N    Mean   σ dist.    Mean   σ dist.       N     Mean     σ dist.    Mean   σ dist.
17-20       253   224.5   21.39     173.5   20.76       213   [illeg.]   21.11     178.2   18.60
21-24        96   221.1   21.25     170.4   20.28        67   [illeg.]   22.31     177.5   22.44
25-28        42   224.3   18.19     169.5   19.32        47   [illeg.]   25.90     178.5   26.98
29 & over    59   219.3   18.97     174.2   17.16        45   [illeg.]   19.36     175.2   19.80





The tendency of individuals over twenty-nine years of age to perform 
more slowly than individuals under twenty-nine is not clear, possibly 
because of the inadequate sample. For both men and women, the 21-24 
year age group does better than the 17-20 year group. For men, on 
Turning, the trend is clear and consistent: the 21-24 and 25-28 year
age groups are progressively faster than the 17-20 group, while the group 
twenty-nine years of age and over is slower. However, for men, for 
Placing, and for women, for both Placing and Turning, this trend is not 
evident. 


Received March 6, 1943. 





Comparison of an “Industrial” Problem Solving
Task and an Assembly Task *


Ensign Jay T. Rusmore, U.S.N.R. 


This study was undertaken to compare the performance of men and 
women in two different tasks assumed to be representative of commonly 
occurring industrial tasks. Two hypotheses were investigated. The 
first was that there are no sex differences in the performance of spatial 
relation tasks. The second hypothesis was that there is no difference 
between a typical problem solving mechanical task and a typical repetitive 
mechanical assembly task. This study finds no evidence to disprove the 
first hypothesis and no evidence to support the second. 

Crawford in 1940 standardized a test of tridimensional structural 
visualization.¹ This test consists of nine parts which must be put together
against time to form a round block six inches in diameter. The
test has been found to differentiate successfully among certain occupa- 
tions. Crawford’s standardization of his test places design draftsmen at 
the 92nd percentile, detail draftsmen at the 80th, machinists at the 75th,
mechanics at the 65th, and laborers at the 35th percentile. 

Crawford further found that second or third trials on the test did not 
differentiate the occupational groups as well as did the first trial. The 
writer’s study shows that the ability required for performance of the test 
after the original problem has been solved is different from that required 
for the first, or problem solving trial. The function measured by the first 
trial has already been labeled as tridimensional visual structuralization. 
It will be hereinafter designated T.V.S. The function measured by the 
tenth trial will be called the assembly function. This function is appar- 
ently similar to the multitude of industrial assembly jobs in which the 
variable most closely related to speed is dexterity of hand and finger. 


Experimental Procedure 


A preliminary experiment in which the Crawford test was administered 
27 successive times to fifteen subjects revealed that the mean times for 
the completion of the task did not materially decrease after the tenth trial. 


* This study was carried out at the University of California by the writer’s class in 
industrial psychology. Special recognition is due Mr. Albert Williams, who organized 
and administered the testing program. 

1 Crawford, John Edmund. A test for tridimensional structural visualization. A 
new test for mechanical insight designed primarily to measure ability or aptitude in 
drafting. J. appl. Psychol., 1940, 24, 482-492. 




Ten repetitions of the test were then given to 87 women and 44 men, 
upper division students in a class of industrial psychology. The standard 
published instructions for the administration of the test were followed. 


Results 


The mean and standard deviation for each sex on each trial were 
computed. These measures, together with the significance of the differ- 
ences in mean performance of men and women, are given in Table 1. 











Table 1
Time in Seconds Required to Solve the Crawford Tridimensional Structural
Visualization Test

             Men (N = 44)       Women (N = 87)      (M1 − M2) /
Trial        M1       s.d.      M2       s.d.       s.d. diff.
  1        117.5      57.7    135.4      96.6           15
  2         54.1      26.7     63.5      52.8           16
  3         42.0      24.3     43.1      19.5           03
  4         30.8      13.1     29.7      20.3           05
  5         28.3       9.7     34.9      26.6           20
  6         24.8       7.3     27.3      17.4           14
  7         23.6       6.4     27.1      16.9           13
  8         23.5       8.3     27.6      20.4           19
  9         20.8       6.7     24.8      12.9           27
 10         21.1       7.0     22.3       6.3           24





From the last column of Table 1, it will be seen that none of the critical 
ratios are very different from zero. This may be interpreted to mean that 
in no trial was there any significant difference between the mean perform- 
ance of men and women. 
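A critical ratio of this kind divides the difference between two means by the standard error of that difference. The article does not give its formula, so the Python sketch below assumes the usual expression for independent samples, √(s₁²/N₁ + s₂²/N₂), and applies it to the Trial 5 figures from Table 1. The result need not agree with the printed column, and the function name is illustrative.

```python
import math


def critical_ratio(m1, sd1, n1, m2, sd2, n2):
    """Difference between two independent means over its standard error.

    Assumed formula (not stated in the article):
    (m1 - m2) / sqrt(sd1^2 / n1 + sd2^2 / n2)
    """
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / se_diff


# Trial 5 from Table 1: men (N = 44) versus women (N = 87), times in seconds.
cr = critical_ratio(28.3, 9.7, 44, 34.9, 26.6, 87)
print(f"Trial 5 critical ratio (men minus women): {cr:.2f}")
```

The sign is negative because the men's mean time is the shorter one. By the convention common at the time, a ratio of about 3 or more in absolute value was required before a difference was called reliable.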

To discover the relation between T.V.S. and the assembly function, 
correlations between initial and final success for both men and women 
were obtained. For men, the figure came to .09 ± .15, clearly indicating
no relationship. For women, a barely significant relationship, indicated
by a coefficient of correlation of .36 ± .12, was found. Thus, it might be
held that there was, among women, a certain similarity between T.V.S. 
and assembly function. But, like the mountain and the molehill, they 
are quite different. 

The lack of correlation between the first and tenth administration of 
the Crawford test would be expected if the test were an intelligence test. 
The extreme limitation of the range of talent among the college population 
typically produces low correlations for test-retest of intelligence tests. 
The nature of the task presented by the Crawford test, however, appears 



















quite different from the language problems of the typical intelligence test. 
College students are not selected on the basis of any spatial relations func- 
tions and no limitation of the range of talent is to be expected. 

A comparison of Crawford’s sample and the college sample shows that 
the two populations are not very different from each other. The mean 
and s.d. for the industrial population was 153 seconds and 69 seconds; for 
the college population, 117 seconds and 58 seconds. There appears to be 
a small reduction of the range of talent in the college group, principally at 
the poorer performance end of the distributions. But the variation 
among the college scores is still large. The low correlations between 
first and tenth trials cannot be reasonably attributed to a reduction of 
range of talent. They must, therefore, be due to differences of the func- 
tions measured. 


Summary 


Success in the first trial of the Crawford Test of Tridimensional Struc- 
tural Visualization has been found to be related to success in jobs such as 
draftsman, machinist, and mechanic. For the purposes of this study, a 
tenth trial on the same test was called an assembly function, supposedly 
related to success on jobs requiring hand and finger dexterity. The first 
and tenth trial for the women correlated .36 ± .12; for the men, .09 ± .15.
No sex differences in mean levels of performance were found for either 
function. 


Received May 12, 1943. 








Note on Use of Pre-Test Practice Periods by Typist-Clerks 


Morse P. Manson 
University of Southern California 


One hundred fifty unemployed women applicants for typist-clerk jobs, 
all claiming to have had typing training and experience, were given a ten 
minute practice period before the formal typing test. 

At the end of the ten minutes, the practice paper was exchanged for 
typing paper to be used in the test proper. The practice papers were then 
analyzed, with the results as shown in Table 1. 








Table 1
Recapitulation of Material Typed

     Categories                                          Frequency   Percentage
A    Repetitive finger drills                               170         43.8
B    Random, haphazard typing                                81         20.9
C    Comments on testing situation                           61         15.7
D    Personal information                                    28          7.2
E    Straight copy from printed or written material          18          4.6
F    Passages from memory (poems, speeches, songs)           17          4.4
G    Topical material                                        13          3.4

                                                            388        100.0
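The percentages in Table 1 follow directly from the frequencies. As a check, this short Python sketch recomputes them from the tabulated counts. The dictionary keys simply repeat the category letters and are used here only for illustration.

```python
# Frequencies as tabulated in Table 1, keyed by category letter.
frequencies = {
    "A": 170,  # Repetitive finger drills
    "B": 81,   # Random, haphazard typing
    "C": 61,   # Comments on testing situation
    "D": 28,   # Personal information
    "E": 18,   # Straight copy from printed or written material
    "F": 17,   # Passages from memory
    "G": 13,   # Topical material
}

total = sum(frequencies.values())

# Percentage of all tabulated items falling in each category, to one decimal.
percentages = {
    category: round(100 * count / total, 1)
    for category, count in frequencies.items()
}

for category, pct in percentages.items():
    print(f"{category}: {frequencies[category]:>3} items, {pct:>4.1f} per cent")
print(f"Total items: {total}")
```

The base of the percentages is the number of tabulated items, not the one hundred fifty applicants, since one practice paper could contain material of more than one kind.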





Summary and Conclusions 


1. Repetitive finger drills appeared in 43.8 per cent of the material 
typed. This type of practice, according to many instructors, is most 
systematic and practical in warming-up for a typing test. 

2. The other categories of material typed, 56.2 per cent, reveal a lack 
of controlled practice. In a great many cases, there appeared to be com- 
plete disorganization and waste of time. 

3. It is suggested that specific copy be prepared for practice purposes. 
This copy should be constructed to emphasize the typing copy given on the 
formal test. If the test contains statistical material, numbers and col- 
umns of figures should be included in the practice material. 

4. Considerable information of vocational significance appeared in 
many papers. Papers revealed fear, worry, anxiety, tension, dispersion of 




ideas, creative writing, ability to concentrate, insight into the testing 
situation, and suggestions for improving the testing procedures. 

5. Several suggestions for the improvement of the testing procedures 
were: a. Noisy or faulty machines designated; b. Hasty monitors de- 
scribed; c. Delay in getting test under way. 

6. One definitely psychotic candidate was discovered. A check-up of 
her vocational history revealed chronic maladjustment. 

7. On the basis of the papers examined, it can be stated that many 
applicants for typing jobs do not use their practice period most effectively. 
In fact, many candidates, instead of limbering up their fingers, increase 
their fears and tensions to the extent that when the formal typing test is 
administered, they “freeze up” and fail to perform according to their 
abilities. Failure to develop and use systematic practice habits before a 
test may be a contributing factor to the onset of emotional blocks or 
“freezes.” 


Received May 7, 1943. 








Attitudes of Elementary School Children to School, 
Teachers and Classmates 


Samuel Tenenbaum 
Fort Hamilton High School, Brooklyn, N. Y. 


The purpose of this investigation was to determine to what extent 
attitudes expressed by children correlated with intelligence, achievement 
in school work, conduct and proficiency marks in school. 

The subjects were 639 sixth and seventh-grade children in three ele- 
mentary schools located in varying neighborhoods in New York City. 
One school was located in a superior residential section, another was in an 
average section, and the third was in a poor section. Because of the 
variety of neighborhoods, the sampling may be said to be as representa- 
tive as could be obtained considering the huge school population that 
exists in New York City and the limited means at the disposal of the 
author. 

To determine school attitudes, the investigator used his School Attitude
Questionnaire Test.¹ For the other variables, he used the Otis
Classification Test and the school records of the subjects.²

This paper will deal primarily with the investigator’s conclusions. 
Unfortunately, limitation of space makes it impossible to provide detailed 
data and procedures, but these can be found in the studies already 
enumerated. 


The Children’s Responses to the Attitude Questionnaire Test 


An analysis of the subjects’ responses to the Attitude Questionnaire
Test indicates that there is a considerable amount of dissatisfaction with 
the school situation. At least 20 per cent of the children, one out of five, 
are unhappy and maladjusted at school, and are ready to quit at any or 
no pretext. This “20 per cent group” shows up again and again as an 


1 A detailed description of the test, its reliability and validity may be found in the 
writer’s study, A test to measure a child’s attitude toward school, teachers, and class- 
mates. Educ. admin. Super., March, 1940. 

2 A more detailed description of procedure may be found in the writer’s study, A 
school attitude questionnaire test correlated with such variables as I.Q., E.Q., past and 
present grade marks, absence and grade progress. Educ. admin. Super., February, 1941. 
Also a study by the writer, Uncontrolled expressions of children’s attitude toward school. 
Elem. Sch. J., May, 1940. 




extreme group, and does not include other members who are highly critical 
of the school situation but do not express such bitter resentment. For 
example, 21 per cent of the children say they are sad at the thought of 
going to school. Very frankly, 22.2 per cent of the group say they do not
like school. About 22.9 per cent would rather work than go to school 
even if they didn’t need the money. An explanation of this attitude may 
be found in the fact that 23.3 per cent think that work is more fun than 
going to school. The explanation is further supported by the fact that 
21.5 per cent say that, if given the chance, they would take their working 
papers right away. When they do work, 28.4 per cent wish that their 
employer would be different from their teacher. 

And yet they seem to feel conscious of the importance of school for, 
despite these sentiments, only 13.2 per cent say they would quit school 
immediately, if they had their way. Apparently duty does not hold 
them strongly, as is evidenced by the fact that an additional 20.6 per cent 
would appease their conscience by going an additional year to school and 
then would quit. 

If they had all the money in the world, 9.5 per cent said that they 
would stop “right away and have a good time,” while 15.3 per cent would 
continue until graduation from public school. This gives us the approx- 
imately 20 per cent who are unhappy in school and who indicate, by their 
answers to the questionnaire, intense dislike and even hatred. 

That this “20 per cent group” is extreme and does not account for all 
those showing unfavorable attitudes toward school is indicated by the 
fact that 40.4 per cent would make school different and 43.9 per cent would like
the place where they work to be altogether different from school. 

The girls appear to be more favorably disposed toward school and their 
teachers than the boys. This superiority in favorable attitudes runs 
fairly consistently throughout the study. It suggests the possibility that 
school as at present constituted is perhaps better fitted for girls than boys. 

The responses to the twenty items in the Attitude Questionnaire Test 
indicate in unmistakable fashion that sixth-grade boys and girls harbor 
more favorable attitudes to many phases of the school situation than do 
seventh-grade boys and girls. 

When it comes to subject matter, the children express a high regard for 
the value of what they learn in school. Only 3.3 per cent say they learn 
less in school than in any other place and only 9 per cent say that what 
they learn in school will not help them. Of course, the response may be 
due to the fact that they have been taught to associate school with learn- 
ing and that they know of no other kind of learning. 

The children indicate marked liking for their teachers, and in this 
respect, too, the girls’ attitudes are more favorable. This favorable attitude
is markedly greater than for the school situation itself. A little over
8 per cent express dislike of their present teachers, and a little over 6 
per cent express dislike for teachers as a group. 

The children indicate highly favorable attitudes toward their classmates.
About 7 to 8 per cent of the children express unfavorable attitudes
in this regard. 


Free Expression of Attitude Toward School 


In another portion of the study, the children were asked to write down 
honestly what they would say to a friend who asked them what they 
thought about school. They were assured that what they wrote would 
be anonymous and could not be traced back to them. These free ex- 
pressions of school attitudes then were tabulated. At the outset, it be- 
came clear that the children do not regard school with enthusiasm or 
pleasure. By and large, they feel it is a task to be done and a task which 
they think should yield them rich returns. 

This is brought out by the fact that “fun” and “enjoyable” are mentioned
in connection with school 14 times by the girls and 28 times by the
boys. However, “getting an education” and “learning things” are
mentioned 61 times by the boys and 100 times by the girls. The fact that 
education will prove helpful in making money and getting on in the world 
is mentioned 26 times by the boys and 43 times by the girls. 

The tabulated responses revealed that 48.6 per cent of the boys ex- 
pressed a definite liking for school, while 69.0 per cent of the girls re- 
sponded favorably. (See Table 1.) 











Table 1
Distribution of Responses of 290 Boys and 290 Girls to Question
Whether They Liked School

                           Boys                 Girls                 Both
Response             Number  Per Cent     Number  Per Cent     Number  Per Cent
Like School            141     48.6         200     69.0         341     58.8
Dislike School          69     23.8          30     10.3          99     17.1
Mixed Emotions*         80     27.6          60     20.7         140     24.1

Total                  290    100.0         290    100.0         580    100.0





* Listed in this category are responses which indicate mixed emotions and feelings. 
Example of one such response: “Sometimes it’s (school) all right, but at other times I 
hate it because my teacher gets cranky.” 


The girls appear to be more serious about school, more conscious of its 
value as a preparation for life. They mention “education” and “getting 
ahead in the world” in connection with school 100 times, while the boys do 






















so only 61 times. Even when it comes to such practical considerations— 
viewing education as a help in getting a job—the girls mention these 
considerations 43 times while the boys do so only 26 times. 

The girls mention the teacher in a favorable way 49 times, while the 
same number of boys do so only 19 times. 

The children who said they did not like school mentioned the teacher 
most frequently as the reason why they dislike the institution. The 
teacher is mentioned in connection with such terms as “too strict,” 
“insulting,” “unfriendly,” by sixteen boys and girls. The relationship
between teacher and child appears at times to be anything but cordial. 

The children’s reasons for liking or disliking school are mostly of an 
adult type, displaying adult thinking and feeling. As one child naively 
puts it: “I like it (school) a little bit. If I didn’t go to school, I would be 
a garbage man. I want to be a boss and be paid well without doing 
much work.” This general view recurs frequently as a factor associated 
with attitudes toward school. 


Attitudes Correlated With Variables 


Eight variables (I.Q., E.Q., Days Absent, Past Proficiency Marks,
Past Conduct Marks, Present Proficiency Marks, Present Conduct Marks
and Grade Progress) were correlated with the composite score made on
the School Attitude Questionnaire. For six variables the relationship was
positive. For two, E.Q. and Days Absent, it was negative. The negative
relationship was so low, the coefficients of correlation being respectively
-.04 and -.02, that it had no significance. The highest positive
correlations were those for past and present conduct marks, the correla- 
tion for both traits being .23. The other coefficients were considerably 
lower, ranging from .13 to .003. The coefficients of correlation were so 
low that there appears to be no definite or marked association between 
any variable and favorable response to the entire Questionnaire. 

The eight variables were then correlated with the battery of tests 
measuring attitude toward school. All variables correlated positively, 
except days absent, the negative correlation being -.04. The positive
correlations ranged from a high of .19 to a low of .01, with the average 
about .10. The association between the variables and the scores made 
on the battery of tests measuring attitude toward school was so low as to 
have no prognostic or predictive value. 

Seven of the variables correlated positively with the battery of tests 
measuring attitude toward teacher. The only variable which indicated 
negative correlation with this aspect of attitude was Days Absent, the 
relationship between the two variables being -.05. The other variables
were positively correlated. The correlations ranged from a high of .19 













to .07. However, the positive relationships were so low as to indicate no 
marked or definite association. 

The correlation between the eight variables and the battery of tests 
measuring attitude toward classmates showed six positive and two nega- 
tive relationships. None of the correlations—negative or positive—had 
any significance. The highest correlation was .10 and the lowest was .01, 
with the average about .05. It would appear that there is no marked 
association between a child’s attitude toward his classmates and the possession
or the lack of possession of the investigated variables.

Fourteen items comprised the battery of tests measuring attitude 
toward school. The children making scores above the third quartile in 
this battery of tests were compared with the children making scores less 
than the lowest quartile. It was discovered that the children above the 
upper quartile—those expressing the most favorable attitudes—had a 
higher standing in the seven variables under investigation. Only in one 
variable—attendance—was there no reliable difference between the two 
groups. However, even for this variable, whatever difference did exist 
was in favor of the upper quartile group. The critical ratios of the differ- 
ence ranged from 5.43 to 3.36. However, the extensive overlapping of 
the two extreme groups in all eight variables precludes the belief that there 
is a sharp division between the two quartile groups or that the possession 
of large amounts of positive aspects of these traits is highly or inevitably
associated with favorable or unfavorable attitudes. At the most, they 
can only be regarded as a contributory influence. 


Attitude of Problem Children Compared With Average Child 


Teachers were asked to identify children whom they would classify as 
“problems.” This group included 40 boys, and their responses to the 
Attitude Questionnaire were compared to the 639 children included in this 
investigation and described below as the “total group.” The “problem”
child expresses a considerably higher degree of unfavorable attitude 
toward school, teachers and classmates than does the average child. In 
thirteen of the fourteen items determining attitude toward school, the 
“problem” child indicates greater dislike for the school situation. For 
instance, 37.5 per cent of the problem group say they are hardly ever or are 
never happy at school, which compares with 13.8 per cent for the total group.
About 38 per cent of the problem group and about 12 per cent of the total 
group would like to stop school right away. These are typical of the 
differences which were found for most of the other items. The differences 
between the responses for the problem group and the total group are 
reliable for ten of the items. For three items, whatever difference exists
is in favor of the total group. Two of these three items concern the value







of subject matter learned in school. Both the problem and the normal 
group appear to regard this subject matter as valuable. Only to one 
item, which asks the child whether he would care to go to school the rest 
of his life, does the problem group show even a slightly more favorable 
response than the total group. 
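The reliability of a difference between two percentages can be judged with a critical ratio in the same way as a difference between means. The article does not state its formula, so the Python sketch below assumes the simple unpooled standard error for two independent proportions and applies it to the item on being hardly ever or never happy at school (37.5 per cent of the 40 problem children against 13.8 per cent of the 639 children in the total group). It also treats the two groups as independent, which is only approximate if the problem children are also counted in the total group.

```python
import math


def critical_ratio_proportions(p1, n1, p2, n2):
    """Difference between two proportions over its standard error.

    Assumed formula (not stated in the article), treating the groups as
    independent: (p1 - p2) / sqrt(p1*q1/n1 + p2*q2/n2), where q = 1 - p.
    """
    se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se_diff


# "Hardly ever or never happy at school": problem group versus total group.
cr = critical_ratio_proportions(0.375, 40, 0.138, 639)
print(f"Critical ratio: {cr:.2f}")
```

Because the problem group numbers only 40, its term dominates the standard error, so even large percentage differences yield moderate ratios under this assumption.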

Of the four items which tested attitude toward teachers, the problem 
group expressed less favorable attitudes in all items. A little over 13 
per cent of the total group express “hate” and “dislike” of their present
teacher, which compares with 20 per cent for the problem group. Fifty
per cent of the problem group, as compared with 37.1 per cent of the normal
group, would want the “man for whom I work” to be different from
their teacher. Thirty-five per cent of the problem group noted that they 
liked “hardly any” or “none” of their teachers, which compares with less 
than 10 per cent of the normal group. Three of the four items testing 
attitude toward teachers show reliable differences between the responses of 
the two groups. The only item not showing a reliable difference has a 
C. R. = 2.6. 

The two items testing attitude toward classmates elicited less favor- 
able response from the problem group than from the total group. Ap- 
proximately 17.5 per cent of the problem group and 6.4 per cent of the 
total group admitted liking “hardly any” or “none” of the children of
their class. Likewise, 12.5 per cent of the problem group reported that
the children at school aren’t their friends and that they didn’t like to play 
with them. This compares with 8.4 per cent for the total group. The 
C. R. for the first item was 4.39, which makes it reliable, and for the 
second item, the C. R. = 2.28, which indicates high probability, although 
not altogether reliable. 


Conclusions 


At least 20 per cent of the children participating in this study express 
dissatisfaction with the school situation and about 40 per cent are highly 
critical of many phases of the educational process. 

The girls express more favorable attitudes toward school than do the 
boys. They appear to regard school more seriously and are more con- 
scious of the school’s value as a preparation for life. 

The children express a high regard for the value of the school curriculum. 

About 7 to 8 per cent express unfavorable attitudes toward their 
classmates. 

The teachers appear to be more popular than the school situation it- 
self, although, among the group disliking school, they are mentioned most 
frequently as the cause. The girls appear to like their teachers better 
than the boys. 













Intelligence, achievement in school subjects, past conduct marks, past 
proficiency marks, present conduct marks, present proficiency marks, 
retardation, and amount of absence are not highly correlated with the 
child’s attitude toward school, teacher and classmates. 

Toward all aspects of the school situation with which this investigation 
has concerned itself the child called a “problem” by his teacher expresses 
markedly more unfavorable attitudes than the total group. 


Discussion 

The responses of the children indicate that 20 per cent would prefer to 
work rather than go to school, no matter how much money they had. 
Hence, it may be inferred that it is not only financial reasons which prompt 
a child to leave school, but that for many the school situation is unpleasant 
and that they would welcome any opportunity to sever connections with 
the institution, even if provided for financially. 

Although there is evidence that the child has not gained any benefit 
from the kind of instruction which is generally expected of the school, the 
child assumes that the school educates him. This is illustrated by the 7B 
child who writes: “Yes, I like it because you learn a good education and
when you get out you could get a job.” Another 7B child writes: “I like
school because you lern a lot and at P. T. you get a lot of exercise.”

This investigation does not support the theory, held by many psychologists 
and mental hygienists, that failure is inevitably associated with 
resentment and “hates.” Those children who failed and did poor work 
in school did not express school attitudes notably different from those of 
the bright and the accelerated children. The picture of the failing child 
as a potential delinquent and a maladjusted individual is not borne out 
by this study. 

If school failure were concomitant with resentment, hate and anti- 
social attitudes, society would always have a known group of antagonists 
against which it would have to guard itself. Every child low in intelli- 
gence would be a potential delinquent and social menace. Not only 
would this be a most unfair provision of nature, condemning an innocent 
human being from birth, but it would be at the same time a grave problem 
for society. 

The study suggests that society provides the child with a sense of 
right and wrong, with values, traditions and customs. Because the child 
is born plastic and pliable, he quickly adopts the attitudes and the outlook 
of the community in which he lives. Since the school is an institution in 
the community, assigned by the community to do a definite task, the 
child takes it for granted that the institution is doing the task. He is not 
critical of the institution; he accepts it. This attitude does not make him 







happy about being a member of the institution. He may be very un- 
happy within its environs, but, nevertheless, he thinks that the institution 
is good and desirable and serves worthy ends. The school, it would seem, 
is a receiver of attitudes, not a creator of them. The child comes to school 
with preconceived notions of how to regard school and tries to get and 
thinks he gets from school what the community expects the school to give. 


Received April 15, 1943. 








An Evaluation of Word and Picture Tests for 
First and Second Grades * 


Freda Poston and James R. Patrick 
Ohio University 


For the last twenty years, authors of reading tests for young children 
have used combinations of words and pictures in attempts to measure 
many different aspects of the reading process. An examination of sev- 
eral of these word-picture tests reveals that they are expected to measure 
such varied abilities as word recognition, word meaning, ability to follow 
directions, phrase recognition, and comprehension of phrases, sentences, 
and paragraphs. Yet there seems to be very little experimental evidence 
concerning the role which pictures actually play in reading except for a 
few studies of comprehension of textbook material with and without 
pictures. Even here the evidence available is contradictory. Miller 
concluded, after an experiment with six hundred children in grades one 
to three, that those reading from a modern illustrated textbook compre- 
hended no better than those using the same books with the pictures 
covered.¹ Pleitz, on the other hand, after a similar experiment with 90 
first grade children, using three primer stories with and without pictures, 
claims to have found significant differences favoring the use of 
illustrations.² 

There seems to be a conspicuous lack of evidence concerning the role 
of pictures in relation to words when used in standardized word tests for 
children. The task of matching single words with pictures is found in 
tests called word recognition,³ word meaning,⁴ or simply word picture 
tests.⁵ Other tests of word recognition and word meaning are made 


* Part of a paper submitted by Freda Poston for master’s degree at Ohio University. 
Work done under the supervision of Prof. James R. Patrick. 

1 William A. Miller, Reading with and without pictures, Elem. Sch. J., p. 676, May, 
1938. 
2 Janet Pleitz, The educational value of pictures in first grade reading. Ohio Univer- 
sity Thesis, August, 1932. 

3 Arthur I. Gates, Gates primary reading test, Type I, Word recognition, New York: 
Bureau of Publications, Teachers College, Columbia University, 1926-1936. 

4 Albert G. Reilley, Primary reading test, Part II, Meaning of words, Boston: Hough- 
ton Mifflin Company, 1939. 

5 Gertrude Hildreth, Metropolitan achievement tests, Yonkers-on-Hudson, New York: 
World Book Company, 1940. 












without pictures.⁶ Owing to the lack of adequate definitions of word recognition 
and word meaning, widely different tests are published under the 
same labels, when the authors themselves probably had different ideas 
of what they were trying to measure. Gates calls his word recognition 
test, made with words and pictures together, a test of “ability to recognize 
and pronounce words singly.”⁷ This seems to imply that he expects 
results similar to those secured by asking a child to pronounce isolated 
words without pictures. However, he states that in exercises similar to 
those in his tests “an appraisal of the picture is utilized in the perception 
of the word,” implying that the picture suggests to the child either the 
meaning of, or the form of the word, or both. 

In administering the Gates test, the child is asked to look first at the 
picture and then select the appropriate word. The Detroit word recog- 
nition test, on the contrary, requires the child to look first at the word, 
and then, having recognized the word, select the appropriate picture.⁸ 

Reilley uses a word picture matching technique similar to that of 
Gates, but calls his test one of “meaning of words.” He also has a test 
of “recognition of word forms,” but this is made without pictures. Evidently 
this author believes that the matching of words and pictures is 
primarily a measure of word meaning rather than word recognition.⁹ 
Yet the author of the Metropolitan tests employs tests without pictures 
to measure both word recognition and word meaning, and calls her test 
containing both words and pictures simply a “word picture” test.¹⁰ 

Three different techniques have been used in attempts to measure 
word recognition. One is the method of pointing to a word and asking 
the child to name it. The Iota Word Test is an example.¹¹ Certain other 
tests require the child to identify and mark from a series of words the 
one word named by the teacher, as in Test 2 of the Metropolitan Achieve- 
ment Tests. The third and most widely used technique is the matching 
of words and pictures. A series of pictures is presented, accompanied 
by a list of words, and the child is asked to draw a line connecting each 
picture with the word which bears some relation to it. In one procedure 


6 Part I of Reilley’s Primary reading test, and Parts 2 and 3 of Metropolitan 
achievement tests are examples. 

7 Arthur I. Gates, Manual of directions for Gates primary reading tests, New York: 
Bureau of Publications, Teachers College, 1935. Also, Arthur I. Gates, The improve- 
ment of reading, New York: The Macmillan Company. 

8 Eliza F. Oglesby, Detroit word recognition test, Yonkers-on-Hudson, New York: 
World Book Company, 1925. 

9 Albert G. Reilley, op. cit. 

10 Gertrude Hildreth, op. cit. 

11 Marion Monroe, Methods for diagnosis and treatment of cases of reading 
disability, Gen. Psychol. Monog., IV, pp. 4-5, October, 1928. 










the word is suggested by pointing to it; in another it seems that the voice 
is used to suggest the word; in still another, the picture. 

Little attempt has been made to evaluate the three techniques by 
using children’s responses to the various items in the tests. Primary 
reading tests, particularly those intended for groups, differ markedly 
from intelligence tests in this respect. For the Binet test of intelligence 
no items were included which were not correctly answered by from sixty 
to seventy-five per cent of the children of a given age. Most of the more 
recently published intelligence tests are validated by correlation of results 
with those of the Stanford Revision of the Binet. The term “reading 
age” is of relatively less value in its use than the term “mental age,” 
since the latter is based on what children actually do with the different 
items which make up the tests, whereas in the former there seems to be 
little agreement of concepts underlying the measures. 

The fact that tests in common use have not been subjected to an 
analysis based on children’s responses to each item, and the use of tests 
with and without pictures under similar labels with lack of evidence to 
show that they produce essentially the same results, point to a need for 
more research concerning the materials and techniques used in test 
construction. 


Problem and Method 


The purpose of the present study is to evaluate techniques of word 
and picture matching employed in three different tests of word recogni- 
tion and word meaning. This evaluation is based on responses of one 
hundred children to the specific items of each of the tests. Each item 
in each test was presented individually to each child by means of the 
three techniques heretofore mentioned which are commonly used in word 
tests. Children’s responses to materials containing words and pictures 
together, compared with their responses to the same words presented 
according to each of the two methods of testing without pictures should 
help to determine whether tests with and those without pictures yield 
the same results. If results are different, the difference in responses might 
show whether the test was for the children one of word meaning, or of 
picture interpretation, rather than a measure of recognition of word 
forms alone. 

Subjects. One hundred unselected first and second grade children 
were used as subjects. There were twenty-six first grade girls, twenty- 
four first grade boys, twenty-four second grade girls, and twenty-six 
second grade boys, making fifty children of each sex and fifty in each 
grade. 










Materials. The test materials used were the Gates¹² and the Manwiller¹³ 
word recognition tests and Reilley’s word meaning test.¹⁴ The 
Gates test contains forty-eight items each consisting of a picture and 
four words. The child is asked to look first at the picture and then 
select the appropriate word. In the Manwiller test each item has one 
word and four pictures, from which the child selects the appropriate 
picture. The Reilley test is similar in construction to the Gates, in that 
each item contains a picture and four words from which the word relating 
to the picture is selected. The three tests contain ninety-three different 
items each requiring identification of words in relation to pictures. 

Procedure. The items of each of the three tests were presented under 
the following conditions: (A) matching of words and pictures; (B) identi- 
fying the same words without pictures when pointed to by the experi- 
menter; (C) selecting the correct word, after hearing it pronounced, from 
a group of four words; (D) speaking the first word thought of when 
pictures are seen alone. 

To eliminate the effect of practice, subjects were divided into two 
groups, and materials were presented in reverse order to Group II. The 
groups were equated for age, mental age, IQ, sex, school grade, reading, 
and ability to do the various kinds of work given in the experiment.¹⁵ 
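
The reversal of presentation order described above can be sketched briefly. What follows is only an illustration and not a record of the authors' procedure: it assumes that Group I met the four conditions in the order A, B, C, D and that Group II met them in exactly the reverse order. The names CONDITIONS, presentation_order, and mean_position are hypothetical. Under that assumption, each condition occupies the same average serial position (2.5) across the two groups, which is the sense in which reversal balances practice.

```python
# Hypothetical sketch: the order A, B, C, D for Group I is an assumption.
# The paper states only that Group II received the materials in reverse order.
CONDITIONS = ["A", "B", "C", "D"]


def presentation_order(group):
    """Return the assumed order of conditions for Group 1 or Group 2."""
    if group == 1:
        return list(CONDITIONS)
    if group == 2:
        return list(reversed(CONDITIONS))
    raise ValueError("group must be 1 or 2")


def mean_position(condition):
    """Average 1-based serial position of a condition over both groups."""
    positions = [presentation_order(g).index(condition) + 1 for g in (1, 2)]
    return sum(positions) / len(positions)


if __name__ == "__main__":
    for c in CONDITIONS:
        print(c, presentation_order(1).index(c) + 1,
              presentation_order(2).index(c) + 1, mean_position(c))
```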

The test materials used in the experiment, although taken from group 
tests, were administered individually. Results obtained with our sub- 
jects, compared with norms for the group tests, would be a means of 
checking procedures to be used in presenting the same items without 
pictures, since these particular tests have never been standardized with- 
out pictures. Since condition B (speaking the words without pictures 
when pointed to by the experimenter) required oral pronunciation of the 
words, it was necessary to give this part of the experiment as an individual 
test. By also presenting the items individually under each of the other 
three conditions, results of the four conditions would be more comparable. 


12 Arthur I. Gates, Gates primary reading test, Type I, Word recognition, Form I, 
New York: Bureau of Publications, Teachers College, 1926. 

13 Charles E. Manwiller, Manwiller word recognition test, Form A, New York: World 
Book Company, 1934. 

14 Albert G. Reilley, Primary reading test, Form A, Part II, Meaning of words, 
Boston: Houghton Mifflin Company, 1939. 

15 The basis of equating for mental age and IQ was the administration of the 
Stanford Revision of the Binet test to each child, individually, by the experimenter. Subjects 
were matched in reading by their scores on the Gray Oral Reading Check Tests, and 
matched in ability to do the kinds of work in the experiment by scores on Tests 1, 2, 
and 3 of Metropolitan Achievement Tests. 







Results 


Test scores of our subjects were first compared with the norms for 
age and grade levels of these children, and show that, even though these 
were intended as group tests, giving them individually did not alter the 
results. For example, the expected grade score¹⁶ for our group on the 
Gates test was 2.3, whereas the actual median score was 2.37. Results 
for the other two tests showed an equally close agreement between actual 
scores of our group and the expected scores according to the norms. 
Further, a comparison of scores for the two equated groups showed that 
very little practice effect resulted from presenting the test materials 
under four different conditions. Therefore, scores of the two groups were 
combined. 





Table 1 
Percentages of Correct Responses Under Conditions A, B, C, and D * 

                                          A        B        C        D 

Gates Test, 48 items, 100 Pupils 
  Possible Correct Responses           4800     4800     4800     4800 
  Number Correct                       3297     2603     3410     2163 
  Percentages Correct                 68.68    54.22    71.04    45.06 

Manwiller Test, 25 items, 100 Pupils 
  Possible Correct Responses           2500     2500     2500     2500 
  Number Correct                       2268     2089     2275     1684 
  Percentages Correct                 90.72    83.56    91.00    67.36 

Reilley Test, 20 items, 100 Pupils 
  Possible Correct Responses           2000     2000     2000     2000 
  Number Correct                       1608     1288     1778     1195 
  Percentages Correct                 80.40    64.40    88.90    59.75 

Total for Three Tests, 93 items, 100 Pupils 
  Possible Correct Responses           9300     9300     9300     9300 
  Number Correct                       7173     5980     7463     5042 
  Percentages Correct                 77.12    64.30    80.24    51.65 

* A: Correct identification of word in relation to picture. 
  B: Pronouncing word correctly without picture. 
  C: Selecting correct word after hearing it pronounced. 
  D: Words called forth by pictures alone. 





Analysis of data in Table 1 brings out the fact that 12.8 per cent 
more correct responses were made under condition A, words and pictures 
together, than to words alone when these words are indicated by pointing, 


16 A grade score refers to the median score for pupils at a given age and grade level. 
Norms are given in terms of grade scores. 







condition B. This suggests that presentation of word and picture to- 
gether facilitated the recognition of the correct word. However, for 
certain items in which the pictures were misinterpreted by the pupils, 
the presence of the picture seemed to retard rather than facilitate the 
finding of the correct word. 
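
The percentages in Table 1, and the difference of 12.8 between conditions A and B, can be checked from the printed counts. The sketch below is offered only as an arithmetic check. The function names are hypothetical, and the Manwiller count under condition B is assumed to be 2089, the value implied by the printed percentage (83.56) and by the printed column total (5980).

```python
# Counts of correct responses as printed in Table 1 (100 pupils per test).
# Assumption: the Manwiller count under B is read as 2089, the value implied
# by the printed percentage (83.56) and by the printed column total (5980).
PUPILS = 100
ITEMS = {"Gates": 48, "Manwiller": 25, "Reilley": 20}
CORRECT = {
    "Gates": {"A": 3297, "B": 2603, "C": 3410, "D": 2163},
    "Manwiller": {"A": 2268, "B": 2089, "C": 2275, "D": 1684},
    "Reilley": {"A": 1608, "B": 1288, "C": 1778, "D": 1195},
}


def percent_correct(n_correct, n_items, n_pupils=PUPILS):
    """Correct responses as a percentage of all possible responses."""
    return 100.0 * n_correct / (n_items * n_pupils)


def total_percent(condition):
    """Percentage correct for one condition, pooled over the three tests."""
    n_correct = sum(CORRECT[test][condition] for test in CORRECT)
    n_items = sum(ITEMS.values())
    return percent_correct(n_correct, n_items)


if __name__ == "__main__":
    for cond in "ABCD":
        print(cond, round(total_percent(cond), 2))
    print("A minus B:", round(total_percent("A") - total_percent("B"), 1))
```

Pooled over the 93 items, the difference between A and B comes to about 12.8, in agreement with the text. Recomputed in the same way, the pooled figure for condition D (5042 of 9300) is about 54.2 rather than the 51.65 printed in the table, so one of those printed figures may contain a misprint.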

Responses of different children to the same pictures indicated that 
many of the pictures were ambiguous. For example, the eleventh item 
of the Manwiller test, giving mouth as the correct word, shows a picture 
intended to represent a mouth. When the picture was presented alone 
it was called nest, cocoon, milkweed pod, lips, cushion, bean, shell, beet, 
and peanut by different pupils. Responses to pictures of the Gates test 
reveal even greater ambiguity. For example, one item consists of the 
word chalk, accompanied by a picture intended to represent a box of 
chalk. When pictures were presented alone, only twelve of the hundred 
children said chalk or box of chalk. Forty-eight said box, while others 
called it cigarettes, candy, cigars, box of matches, box of milk bottles, 
popgun shells, battery, sticks, crayons, pencils, box of wood, money, 
bullets, and paints. Many other examples could be cited showing the 
children’s difficulty in interpreting the pictures. A large number of these 
pictures could easily call forth a great variety of responses even by 
adults.¹⁷ Obviously, the pictures of these tests, in many instances, failed 
to suggest to the children the words intended by the authors of the tests. 

In general, those items showing fewest children responding correctly 
to pictures alone were also low in number responding correctly when 
words and pictures were presented together. The fact that they are 
lower in both suggests that the nature of the picture is related to the 
recognition of the correct word. For some items containing pictures 
which did not immediately call forth the correct word, children were 
able after seeing both word and picture, to perceive the correct relation- 
ship between the two. It was seen, however, that such items required 
a greater amount of time than those in which the picture suggested the 
correct word. For example, the item in the Gates test containing a 
picture of a girl running usually brought the response girl, rather than 
the correct word run when the pictures alone were seen. By comparing 
the time required for this item (Second item, Gates test) with that re- 
quired for one in which the picture was rarely misinterpreted (First item, 
Gates test; word boy and picture of a boy), it was found that the children 
could relate the word boy to the picture of a boy in from three to six 
seconds, but required from ten to twelve seconds to relate the word run 


17 Some interesting data were secured by presenting these pictures alone to a group 
of college students and one of primary teachers. For some of the pictures, these adults 
were seen to vary in responses quite as much as the children. 






















Table 2 
Number of Children Making Correct Responses to Each Test Item Under 
Conditions A, B, C, and D (N = 100 pupils) 
Gates Test 
Item A B C D Item A B C D 
1. boy 99 98 99 77 25. hair 53 49 57 53 
2. run 92 93 98 11 26. stand 63 43 67 3 
3. hen 96 88 97 46 27. goat 83 75 84 35 
4. sit 65 66 68 0 28. hide 67 49 79 11 
5. king 91 68 86 70 29. crow 75 51 76 34 
6. top 93 71 89 86 30. soup 54 35 61 48 
7. hand 81 60 90 96 31. pick 56 38 61 4 
8. bow 67 33 75 37 32. window 90 78 95 86 
9. men 85 67 94 54 33. shop 55 33 49 2 
10. four 88 85 84 95 34. wheat 57 40 66 18 
11. hay 68 53 80 51 35. throw 57 25 61 5 
12. barn 83 81 91 49 36. leaf 40 28 40 99 
13. bear 92 67 93 74 37. bank 45 35 56 62 
14. water 82 71 88 45 38. wood 85 72 92 38 
15. sleep 73 60 80 16 39. smile 52 33 64 0 
16. face 59 52 68 18 40. lake 62 40 74 6 
17. lie 46 28 51 3 41. light 75 50 81 78 
18. flag 79 65 85 94 42. mail 40 39 53 13 
19. store 66 73 68 86 43. cover 27 30 48 0 
20. farmer 88 76 91 45 44. roof 58 49 72 38 
21. rain 73 68 83 51 45. chalk 29 14 57 12 
22. clock 83 66 75 97 46. lily 46 14 47 15 
23. blow 65 47 70 14 47. cock 43 25 53 2 
24. walking 74 69 88 12 48. drive 53 39 55 3 
Manwiller Test 
Item A B C D Item A B C D 

1. bed 88 83 89 100 14. rabbit 98 95 100 93 
2. big 96 92 92 18 15. ear 84 51 68 63 
3. bird 97 91 97 95 16. eye 89 72 88 96 
4. boy 100 98 99 99 17. feet 86 70 82 98 
5. bread 79 66 OF 99 18. hen 98 89 96 66 
6. cat 95 92 93 76 19. eat 83 64 83 3 
7. cow 99 95 98 100 20. house 82 91 93 97 
8. dog 98 98 96 96 21. jump 9+ 97 100 11 
9. man 99 95 95 93 22. snow 86 76 90 8 
10. mouse 84 69 77 77 23. tree 98 95 9 100 
11. mouth 56 45 58 18 24. run 99 90 99 8 
12. one 98 95 99 39 25. in 93 95 98 16 
13. pig 97 94 92 99 

























Table 2—Continued 

Reilley Test 
Item A B C D Item A B C D 
1. books 97 77 96 80 11. sheep 90 60 93 76 
2. money 78 47 99 32 12. shoes 85 64 85 99 
3. eggs 95 93 96 32 13. people 81 69 93 27 
4. store 93 78 94 82 14. church 85 35 93 75 
5. basket 99 77 98 100 15. fish 80 73 85 98 
6. sticks 80 50 76 83 16. eating 80 71 90 41 
7. hands 87 66 92 96 17. talking 72 54 92 23 
8. garden 88 83 97 51 18. face 59 52 77 86 
9. clock 90 66 95 99 19. ship 51 37 66 54 
10. door 77 77 94 10 20. pair 39 45 64 9 





to the picture of a girl running. Data in Table 2, items 1 and 2 of 
Gates test, show that the words boy and run were almost equally well 
known when words were presented alone, under conditions B and C. 
Therefore, the difference in time required for the two items must have 
been due, at least in part, to the nature of the pictures used.¹⁸ The fact 
that more time was required for those items in which pictures alone were 
often misinterpreted suggests that here the presence of the picture re- 
tarded rather than facilitated the finding of the correct word. 

Having seen that the pictures included in the tests are apparently 
very important in determining the child’s response to words and pictures 
together, we now come to that part of the experiment dealing with the 
particular words used. This study brings out the fact that, for most 
items, a slightly greater number of children responded correctly to the 
combination of the printed word with the spoken word, condition C, than 
to the word-picture combinations, condition A. When the clue to recog- 
nition is the spoken word, as in condition C, the particular words from 
which the correct one is to be selected are very important in deter- 
mining the child’s choice of words. An interesting example is seen in the 
case of the word clock which appeared in both the Gates and the Reilley 
tests with the picture of a clock. Children rarely failed to identify these 
pictures. In the Gates test, where the other words used in the item were 
chalk, clean, and block, seventy-five children chose the correct word clock, 


18 An interesting topic for further research might be the effect of the order in which 
the different test items are presented. For this study they were used in the order given 
in the test, as one of our purposes was to find out some of the difficulties of children 
with the tests as now published. It is possible that some of the difficulty in relating 
run and girl running may have been caused by the fact that it immediately followed the item 
boy and picture of boy, this first item having suggested to the child the idea of relating 
nouns rather than verbs to the pictures used. 










while fourteen selected the word chalk. In the Reilley test the word 
clock appears with the words still, street, and sand. In this setting, 
ninety-five of the hundred children were able to select the correct word. 
The degree of similarity encountered in abstracting one element from a 
group has long been known to be a factor in determining difficulty, but 
evidently test makers vary greatly in the extent to which they utilize 
this fact. 
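
The contrast between the two settings of the word clock can be made concrete with a rough measure of similarity in word form. The sketch below is an illustration only. It uses the matching ratio from Python's standard difflib module, a measure chosen here for convenience and not one used by the authors, and the names similarity and mean_similarity are hypothetical. The alternative words are those quoted above for the Gates and Reilley items.

```python
from difflib import SequenceMatcher

# Target word and the alternatives quoted in the text for each test.
TARGET = "clock"
GATES_DISTRACTORS = ["chalk", "clean", "block"]
REILLEY_DISTRACTORS = ["still", "street", "sand"]


def similarity(a, b):
    """Similarity in spelling between two words, from 0.0 to 1.0.

    This is difflib's matching ratio, used here as a stand-in measure
    of likeness in word form; it is not the authors' measure.
    """
    return SequenceMatcher(None, a, b).ratio()


def mean_similarity(target, distractors):
    """Average similarity of the target word to its alternatives."""
    return sum(similarity(target, d) for d in distractors) / len(distractors)


if __name__ == "__main__":
    print("Gates:  ", round(mean_similarity(TARGET, GATES_DISTRACTORS), 2))
    print("Reilley:", round(mean_similarity(TARGET, REILLEY_DISTRACTORS), 2))
```

On this measure the Gates alternatives resemble clock far more closely than the Reilley alternatives do, which is in keeping with the smaller number of children (seventy-five against ninety-five) who chose the correct word in the Gates setting.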

An analysis of data in Table 2 shows that there were large differences 
in the number of children responding correctly to the various items of 
each of the tests. For example, in the Gates test, words and pictures 
together, Condition A, only twenty-seven of the hundred children an- 
swered one item correctly (item 43, Table 2, word cover and picture of 
a pan with a cover), while another item (item 1, word boy and picture 
of a boy) was correctly answered by as many as ninety-nine. Numbers 
of children responding correctly to different items of the Manwiller test 
varied from fifty-six to one hundred, while for the Reilley test the variation 
was from thirty-nine to one hundred. It has long been an accepted 
principle of test construction that some items of a test shall be so difficult 
that children will not make perfect scores. However, it is to be expected 
that items be graded in difficulty, so that the difficult items for one grade 
are those which children of the next higher grade could answer correctly. 
Tests which have not undergone an item analysis have no adequate basis 
for the gradation of difficulty in items. The standardization of a test 
by means of group scores masks the difficulty of the various items. 
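
The kind of item analysis referred to above can be sketched for a handful of items. The example below is a minimal illustration and not the authors' computation. It takes five Gates items from Table 2 (condition A, 100 pupils), expresses each as the proportion of children answering correctly, and ranks the items from hardest to easiest. The names GATES_A, difficulty_index, and rank_items are hypothetical.

```python
# Number of children (of 100) responding correctly under condition A,
# for five Gates items excerpted from Table 2.
GATES_A = {"boy": 99, "run": 92, "clock": 83, "chalk": 29, "cover": 27}
N_PUPILS = 100


def difficulty_index(n_correct, n_pupils=N_PUPILS):
    """Proportion of pupils answering an item correctly (higher = easier)."""
    return n_correct / n_pupils


def rank_items(counts):
    """Return item words ordered from hardest to easiest."""
    return sorted(counts, key=counts.get)


if __name__ == "__main__":
    for word in rank_items(GATES_A):
        print(f"{word:6s} {difficulty_index(GATES_A[word]):.2f}")
    spread = max(GATES_A.values()) - min(GATES_A.values())
    print("range in number correct:", spread)
```

A single group score for the test would conceal the spread between the hardest and easiest of these items, which is the point made in the paragraph above.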

Selection of the test vocabulary from approved word lists is only a 
partial solution to the problem of determining difficulty of items for a 
particular age and grade level. There is some evidence to show that 
there is considerable variation between the vocabularies of different first 
grade readers, even though these books are supposed to be based on the 
same word lists.¹⁹ The limited number of words in the tests used in this 
investigation, when selected from word lists of from five hundred to one 
thousand words, would permit much greater difference between the test 
vocabularies and those of primary readers than that found between 
different readers. 

The differences found among the three tests presented under each of 
the four conditions suggest that the selection of both the pictures and 
the words to be included in the test items, as well as the procedures used 
in presenting them to the pupils, are very important in determining the 
number of correct responses made, and therefore need further analysis. 


19 Hockett, John A., and Neely, N. Glen, The vocabularies of twenty-eight First 
Readers, Elem. Sch. J., January, 1937, pp. 344-352. See also Rudisill, Mabel, Selection 
of Pre-Primers and Primers, a vocabulary analysis, Elem. Sch. J., May, 1938, pp. 683- 
693. 










Summary and Generalizations 


Although tests of reading abilities in first and second grades have been 
used for the last twenty years, very little attempt has ever been made to 
evaluate the tests themselves by using children’s responses to the various 
items. The purpose of this study is to evaluate some of the techniques 
employed in the development of three tests intended to measure word 
recognition and word meaning by presenting the specific items of each 
test individually to one hundred children. 

The test materials from Gates, Manwiller, and Reilley were presented 
under the following conditions: (A) identification of words in relation to 
pictures; (B) identification of the same words without pictures when 
pointed to by the experimenter; (C) selecting from a group of four words 
the one orally pronounced by the experimenter; and (D) stating the first 
word thought of when the pictures alone are seen. 

Results of the experiment showed that: (1) in general, more correct 
answers were given when words and pictures were seen together than 
when either was presented alone, but that for certain items in which 
pictures were misinterpreted by the children the picture seemed to retard 
rather than to facilitate the correct response; (2) words were more often 
correctly recognized when seen with other words not similar in form; 
and (3) a great amount of variability was seen both in responses of 
different children to the same test items, and in responses of the same 
children to different items. 

The fact that there were sometimes extremely large differences be- 
tween the responses under the different procedures used suggests that 
probably tests with and those without pictures should not be considered 
equally diagnostic. Whether the ability being measured was recognition 
of word form, or word meaning, or both, however, could not be definitely 
determined by this experiment. A study of the children’s errors reveals 
instances in which there seems to be confusion between two words similar 
in form, as horse and house. In certain other instances, for example, the 
word chicken pronounced instead of the correct word hen, the meaning 
seems to have been the determining factor. For the larger number of 
cases, however, in which the words were correctly pronounced, it was 
impossible to determine what meanings were being associated with the 
words. Much research needs to be done on problems of meanings asso- 
ciated with different words and pictures by children, and the situations 
in which either the form or the meaning of a word is recognized without 
the other, before tests can be adequately evaluated on this basis. Fur- 
ther, the meaning of the pattern of words alone, and pictures alone, and 
the total pattern with a combination of these words and pictures together 
needs to be investigated. Children are learning to read from books in 













which words and pictures occur together. If the skill learned under this 
condition is being measured in these early stages without pictures, the 
distortion of the word-picture pattern may be such that the test fails to 
measure the child’s ability as it functions in a situation in which both 
words and pictures are a part of the total pattern. 

The results of the experiment also bring out the fact that teachers 
should exercise extreme care in the use of such “tests” for classification 
purposes. To place equal emphasis on evaluation of children’s responses 
by tests which show such great discrepancy between items and between 
procedures used in testing is obviously unfair to the children. The fact 
that many pictures were often misinterpreted, and the differences in 
numbers of correct responses under the different conditions point to a 
need for revision of tests with much greater care in the selection of both 
the items and the procedures used. 

Since, as has been mentioned, most children are learning to read in 
an atmosphere in which pictures are a part of the total reading situation, 
this suggests that tests with pictures, if carefully made, might be impor- 
tant elements in the early development of reading habits, although these 
habits might retard reading later when pictures do not always accompany 
the words. If pupils are to learn reading through functional experiences, 
with a minimum of formal drill, better diagnostic measures are needed. 
Children cannot be expected to read independently either for enjoyment 
or for information relating to problems, without an adequate working 
vocabulary. More carefully made instruments, better related to the 
kinds of materials used by children, are needed in order that teachers 
may be better guides to further reading experiences. 

It is to be understood that these tests are intended to measure only 
a small part of the different abilities actually involved in reading. This 
study should not be considered to be an attempt to over-emphasize word 
recognition and word meaning to the exclusion of other processes in 
reading. It is merely an attempt to begin an evaluation of tests and 
procedures in a field in which little has been attempted. Results of this 
experiment show a need for further research on problems relating to 
instruments used in evaluation of children’s performance while learning 
is in process. 





The Minnesota Multiphasic Personality Inventory 
V. Hysteria, Hypomania and Psychopathic 
Deviate * 


J. C. McKinley, M.D., and S. R. Hathaway 
Department of Neuropsychiatry and the Department of Psychology, University of Minnesota 


I. Introduction 


The basic plan of approach in the development of the Minnesota 
Multiphasic Personality Inventory has been presented elsewhere (1). 
Three scales have been described as developed according to that plan (2) 
(3) (4) and the present paper will present some salient points regarding 
three more scales for the detection respectively of hypomania, psycho- 
pathic deviation and hysteria. These scales have been in preliminary 
use (5). 

The procedures for the derivation of the three scales are similar and 
can be presented generally for the three. It is essential to note that the 
details of scale development have involved many tentative trials with 
subsequent validating studies and finally the adoption of the best scale 
for inclusion in the Inventory. A description of this process would be 
too detailed for profitable publication. It follows, though, that the 
establishment of validity of a scale becomes most important, consequently 
the scale descriptions rest more than is usual upon the establishment of 
clinical validity. Clinical criteria for validity present many potential 
pitfalls. There must be reasonable assurance that the clinical opinion 
and the scale derivation process are separated for each new validating 
case; i.e., clinical diagnoses must be based upon valid and generally
applicable concepts and one must be assured that the diagnostic judg- 
ments are not determined on a knowledge of the item content of the scale 
to be validated. When all the work of derivation and validation of a 
scale is centered in one laboratory the dangers in establishing validity are 
avoided only with difficulty. The following scales are presented with a 
full realization that their final validity and general usefulness will depend 
upon the experience of others, though we have endeavored to avoid 
obvious pitfalls. 

* Supported in part by research grant from the Graduate School of the University
of Minnesota and in part by Work Projects Administration Official Project No.
165-1-71-124, Sub-Project No. 379.




The problems to be solved by the scales of the Minnesota Multiphasic 
Personality Inventory are frankly those of detecting and evaluating typi- 
cal and commonly recognized forms of major psychological abnormality. 
The terminology and classification system are largely drawn from ordi- 
nary psychiatric practice. Where there are correlations between clinical 
syndromes, the scales tend to show correlation; where the clinically recog- 
nized diagnosis is impure the scales will tend to be impure. These are 
usually, therefore, not statistically pure scales. They often contain 
deliberately diverse types of items. One additional point should be 
especially stressed. Every item finally chosen differentiates between 
criterion and normal groups and that is the reason for acceptance or 
rejection of the items. They are not selected for their content or theo- 
retical import. Frequently the authors can see no possible rationale to 
an item in a given scale; it is nevertheless accepted if it appears to differ- 
entiate. Some scales have been selected and put into experimental use 
in our clinic before the items were studied for content. Occasionally, 
items that differentiate have been rejected to eliminate some undesirable
statistical trend. Thus items from the depression scale tend naturally to 
recur in most other scales and must be omitted in part at least if the inter- 
correlation is not to be undesirably high. 

Specifically, the derivation of scales begins with the selection of a 
criterion group or groups. These persons have all been examined and 
diagnosed by the staff of the Department of Neuropsychiatry as patients 
in the in-patient service of the University Hospital. The size of the 
criterion group varies usually between 25 and 50. For some scales it 
required several years to collect a sufficient number of cases to permit 
satisfactory scale derivation. These criterion cases are selected to be as 
representative as possible of the classical concept of the given syndrome. 
In practice, as any thoughtful clinician will agree, clear and uncompli- 
cated cases of such common diagnostic classes as hysteria, mania and the 
like are rare. There is most frequently an admixture of symptoms of 
other syndromes and commonly it is not at all certain that another skilled 
interviewer would agree as to the main diagnosis. To be psychologically 
abnormal in one recognized way seems to greatly increase the appearance 
of other psychologically abnormal states. This is easily understood in 
the case of depression but the connection is more obscure for other syn- 
dromes. It is to be emphatically understood that these scales are recog- 
nized to partake of the same defects that are found in the classification 
system now in general use. It would doubtless be more pleasing to the 
theoretically minded person if an approach were adopted involving a new 
nosology based on experimentally determined categories. All that can 
be said in our defense is that as a matter of practicality in the clinical
setting of today the criterion groups correspond to the types that are now
being generally recognized and the scales are deliberately prepared to aid 
in the kind of diagnostic judgments we now understand. It is to be hoped 
that the future will see much improvement in classification and when that 
time comes, new scales and possibly new techniques will need to be de- 
veloped for better performance. 

For each scale the responses of the criterion group or groups to each of 
the 550 items of the Minnesota Multiphasic Personality Inventory were 
tabulated to show the percentage frequency of occurrence of each possible
answer: True, False, or Cannot Say. These response frequencies
were tabulated for comparison with expected frequencies as determined on 
normal groups. 
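The tabulation step described above can be sketched in a few lines. This is only an illustration: the function name `response_percentages` and the 25 sample responses are hypothetical and are not taken from the study records.

```python
from collections import Counter

# The three possible answers to every inventory item.
ANSWERS = ("True", "False", "Cannot Say")

def response_percentages(responses):
    """Return the percentage frequency of each possible answer to one item."""
    counts = Counter(responses)
    n = len(responses)
    return {answer: 100.0 * counts[answer] / n for answer in ANSWERS}

# Hypothetical responses of a 25-person criterion group to a single item.
criterion_item = ["True"] * 15 + ["False"] * 8 + ["Cannot Say"] * 2
percentages = response_percentages(criterion_item)
print(percentages)
```

The same function applied to a normal group gives the expected frequencies against which the criterion percentages are compared.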

The normal groups most commonly used for item by item contrast 
were composed of 339 persons selected from among the general Minnesota 
normals and of 265 precollege cases from among high school graduates 
applying for admission to the University. The general sample was di- 
vided into 139 men and 200 women, tabulated separately to show sex 
differences. These persons were between the ages of 26 and 43 inclusive 
and were all married. They declared themselves to be not under a 
doctor’s care at the time of taking the inventory and are considered normal 
on that single basis. The modal years of schooling was 8 and few had 
gone beyond high school. These particular persons were used because 
they were felt most likely to be stable and representative. The tabulation 
for the entering college students was based upon 151 men and 114 women. 
These latter tabulations were invaluable to control the strong tendency 
of responses to certain items to vary widely in accordance with age or 
intelligence, or both. 

For all scales the percentages for the criterion groups were compared 
to each of the normal percentages and an initial reservoir of items was 
selected which included all those showing a consistent difference. Statis- 
tically no item was chosen that showed a difference less than twice its 
standard error and most items yielded differences greater than three times 
their standard errors. The steps in the selection of the final scale items
were more variable and will be presented in more detail with the
descriptions of the separate scales.
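The screening rule of "twice its standard error" can be illustrated as follows. The paper does not state which formula was used for the standard error; the sketch assumes the usual unpooled standard error of a difference between two independent proportions. The function names and the example proportions are hypothetical; the group sizes of 50 and 339 are those mentioned in the text.

```python
import math

def difference_ratio(p_crit, n_crit, p_norm, n_norm):
    """Ratio of the difference between two proportions to its standard error.

    Assumption: unpooled standard error for independent groups.
    """
    se = math.sqrt(p_crit * (1 - p_crit) / n_crit
                   + p_norm * (1 - p_norm) / n_norm)
    return (p_crit - p_norm) / se

def passes_screen(p_crit, n_crit, p_norm, n_norm, min_ratio=2.0):
    """True if the difference is at least min_ratio standard errors."""
    return abs(difference_ratio(p_crit, n_crit, p_norm, n_norm)) >= min_ratio

# Hypothetical item: 60% "True" among 50 criterion cases,
# 30% "True" among 339 general normals.
ratio = difference_ratio(0.60, 50, 0.30, 339)
print(round(ratio, 2), passes_screen(0.60, 50, 0.30, 339))
```

An item with a small difference, for example 35 per cent against 30 per cent on the same group sizes, falls well below the threshold and would not enter the reservoir.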

To establish the validity of the various scales as they were derived, 
their power to differentiate test cases from normals was used as an indi- 
cator. Test cases is the term used in this paper to designate cases identi- 
fied relatively or entirely independently of the criterion groups. For the 
most part, these cases were drawn from among hospitalized patients that 
were diagnosed routinely by the staff during the preliminary derivation 
of items and before any scale was made available. Where possible, test
cases were taken from records and diagnoses made in an entirely different
clinical setting. Naturally these latter cases are most desirable and 
where they were not available in suitable numbers for these scales, it is to 
be hoped that other workers will supply the necessary final validation. 
At least one such study has already been published (6). 

It is important to note, nevertheless, that test cases were not so care- 
fully selected as the criterion cases to represent either the pure syndromes 
or careful evaluation by the staff. This was necessary because of the 
small percentage of good clinical cases among all those seen. It was 
assumed that the best scale was the one which would most effectively 
separate test cases from normals and from other types of abnormals. The 
chief criterion for excellence of separation was the amount of overlap of 
the groups. It was recognized that even a perfect scale could not com- 
pletely separate these test cases from normals since some of them were 
borderline and probably no worse than some of the “normal” group and 
some of them may have been incorrectly diagnosed clinically. Some also 
changed radically between the time of diagnostic summary and the time 
of testing. In considering the data presented showing the standard scores 
of test cases against the normal groups, it can usually be assumed that 
the data given represent a poorer picture than would be yielded if the 
cases could have been more carefully selected and the normals more ade- 
quately proved normal. 


II. The Hysteria Scale (Hy) 


A scale for aid in the clinical diagnosis of hysteria was one of the earliest 
problems undertaken in the development of the Minnesota Multiphasic 
Personality Inventory. Almost at once, a promising preliminary scale 
was developed and many hours were then directed towards its improve- 
ment. Although the original scale was bettered somewhat, most of the 
series of experimental hysteria scales were differentially less effective than 
the original, and it rapidly became apparent that our difficulty was due
considerably to lack of definition in the clinical concept, to the concurrence 
of hysterical phenomena with other neurotic symptoms in the same indi- 
vidual, or to downright inability of the psychiatric staff to be sure of 
hysterical reactions in individuals who were under suspicion of developing 
organic disease. 

The persons comprising the criterion groups were drawn mainly from 
the inpatient service of the Psychiatric Unit of the University of Minne- 
sota Hospitals. They had each received the diagnosis, psychoneurosis, 
hysteria, or had been especially noted as having characteristic hysterical 
components in the personality disturbance. In the assignment of these 
diagnostic terms the neuropsychiatric staff followed, as closely as possible,
current clinical practice. Where cases showed a simple conversion symp-
tom such as aphonia, an occupational cramp or a neurologically irrational 
anesthetic area, the diagnosis was usually well agreed upon. In some
cases, however, either there would remain a doubt as to a true organic
illness such as multiple sclerosis, or it would be difficult to distinguish the
syndrome from hypochondriasis or an early schizophrenic reaction.

Several tentative criterion groups were selected from these diagnosed 
cases. The final chief group was made up of 50 cases; the items finally 
selected were repeatedly identified, however, by the several criterion 
groups. The observed frequencies of true or false responses to all items 
were compared in percentages between criterion and normal groups; a 
basic pool of items was established. 

The items could immediately be seen to belong to several categories. 
There was a strong group referring to somatic complaints and another 
negatively correlated consisting of statements tending to show that the 
patient considered himself unusually well socialized. Examples of the 
somatic items were complaints of headache, spells of dizziness and tremor 
of the hands. The social items were well illustrated by his saying “false”
to such items as “I frequently have to fight against showing that I am
bashful,” “I get mad easily and get over it soon,” “Some people are so
bossy that I feel like doing the opposite of what they request even though
I know they are right.” In spite of these items implying a very socialized
makeup the items include “unhappy” and “blue” admissions. These 
latter items are to be contrasted with those in which the patient says that 
he is not repressed or shy with others. Besides the foregoing item types 
there were certain others that persistently appeared in the statistical 
studies but for which we have no adequate interpretation. It was at 
once apparent that the correlation of any final scale adopted would be 
rather high with that previously developed for hypochondriasis (now Hs, 
formerly H-Ch in the Manual). It seemed desirable to decrease this 
correlation by eliminating as many of the overlapping somatic items as 
possible. 

In order to test the results of various changes in item content of the 
several trial scales, test cases were accumulated. These cases were ob- 
tained from several sources. A number of newly diagnosed and therefore 
independent cases were available from the Neuropsychiatric Clinic. In 
addition, two separate small groups of records were obtained through the 
cooperation of Dr. Burton P. Grimes and Major Carleton Leverenz. The 
latter cases had been received in an army station hospital and diagnosed 
psychoneurosis, hysteria. 

Elimination of the somatic items resulted in a marked drop in the 
number of test cases identified and introduced another disturbing
difficulty; if only the non-somatic items were used, there was a strong relation
to age and education. The mean score was more than a half sigma higher 
for the college group than for older persons. These results forced the 
inclusion of some somatic items in the final scale with consequent high 
correlation (r = .52 normals and r = .71 clinic cases) between Hs (hypo- 
chondriasis) and Hy (hysteria). Some relation still remains between age 
and intelligence and the Hy score. The relation seems valid clinically. 

One of the reasons why the compromise intercorrelation was forced
can be illustrated by the behavior of the two scales on two test groups.
The first test group was composed of 75 cases diagnosed hypochondriasis.
Of these, 13 per cent received a T score of 70 or above on Hs alone. (T
scores are standard scores with the mean of normals adjusted to 50 and
the standard deviation adjusted to 10. 70 represents plus two sigma.)
In all, 76 per cent received a score of 70 or above on Hs alone or on both
Hs and Hy. Finally, only 12 per cent had such a score on Hy alone.
Contrast these data with the results on a test group of 60 cases diagnosed
hysteria. Of these, 32 per cent received a T score of 70 or above on Hy
alone. In all, 72 per cent received a T score of 70 or above on Hy alone
or on both Hy and Hs. Finally only 7 per cent had such a score on Hs
alone. Study of these figures shows among other things that if Hy were
discarded because of correlation with Hs in favor of using only the latter,
32 per cent of the test group of hysteria cases would have been missed.
Part of these facts are graphically illustrated in Figure 1. This shows the
standing of the two test groups against the curve of normals for Hy.
Figure 1 also shows the standing of the two test groups on Hs. It will
be seen that Hy discriminates the hypochondriac as an abnormal as well
as does Hs itself. It is possible from these results to omit scoring cases
on Hs in situations where no clinical follow-up is intended.

[Figure 1: four panels showing the Hy and Hs test cases on the Hy and Hs
scales. Ordinate: number of cases. Abscissa: standard score. x = males,
o = females.]

Fig. 1. Graphic presentation of the standing of test cases for Hy and Hs against
the “normal” sample. For further comparison, the standing of the same cases on the
Hs scale is shown.
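The breakdown reported above (elevated on Hs alone, on Hy alone, or on both) amounts to a simple tally against the T score cutoff of 70. A minimal sketch follows; the function names and the five pairs of scores are hypothetical and do not reproduce the test groups.

```python
from collections import Counter

PATTERNS = ("Hs alone", "Hy alone", "both", "neither")

def elevation_pattern(hs_t, hy_t, cutoff=70):
    """Classify one case by which of the two T scores reach the cutoff."""
    hs_high = hs_t >= cutoff
    hy_high = hy_t >= cutoff
    if hs_high and hy_high:
        return "both"
    if hs_high:
        return "Hs alone"
    if hy_high:
        return "Hy alone"
    return "neither"

def pattern_percentages(cases, cutoff=70):
    """Percentage of cases falling in each elevation pattern."""
    counts = Counter(elevation_pattern(hs, hy, cutoff) for hs, hy in cases)
    return {p: 100.0 * counts[p] / len(cases) for p in PATTERNS}

# Hypothetical (Hs, Hy) T-score pairs for five cases.
cases = [(82, 75), (71, 64), (58, 73), (66, 69), (90, 88)]
summary = pattern_percentages(cases)
print(summary)
```

The percentage that would be missed by dropping one scale is the share of cases elevated on the other scale alone.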

Although the above and other statistical points contributed to the 
continued use of both Hy and Hs, the most important determiner was 
clinical experience. All clinicians who used both scales were emphatic 
that there was indubitably a valid clinical difference between two persons 
having high scores on Hs and Hy but differing in that one score was 
higher. There was a different prognosis and treatment indicated for the 
two. Where Hs was higher the physical complaints were diffuse and 
frequently required much less study to establish the presence of an im- 
portant psychological factor in the disability. On the other hand, when 
Hy was dominant, the person frequently appeared normal psychologically 
and his physical complaints were likely to closely mimic or be accom- 
panied by some common physical syndrome of the type now called psy- 
chosomatic. The final decision will lie with other clinics. The scale is 
presented with the expectation that others will check these clinical and 
admittedly subjective impressions. 


The 60 items selected for the final Hy scale are as follows; each is
followed by a T or an F to indicate the direction of the hysterical response.


1. During the past few years I have been well most of the time.
2. I am in just as good physical health as most of my friends.
3. I have never felt better in my life than I do now.
4. I do not tire quickly.
5. I have very few headaches.
6. Much of the time my head seems to hurt all over.
7. Often I feel as if there were a tight band about my head.
8. I am troubled by attacks of nausea and vomiting.
9. I seldom or never have dizzy spells.
10. I have never had a fainting spell.
11. My eyesight is as good as it has been for years.
12. I can read a long while without tiring my eyes.
13. I feel weak all over much of the time.
14. I have had no difficulty in keeping my balance in walking.
15. I have little or no trouble with my muscles twitching or jumping.
16. I frequently notice my hand shakes when I try to do something.
17. I have few or no pains.
18. My hands and feet are usually warm enough.
19. Once a week or oftener I feel suddenly hot all over, without apparent
cause.
20. I am almost never bothered by pains over the heart or in my chest.
21. I hardly ever notice my heart pounding and I am seldom short of
breath.
22. There seems to be a lump in my throat much of the time.
23. I have a good appetite.
24. I wake up fresh and rested most mornings.
25. My sleep is fitful and disturbed.
26. I drink an unusually large amount of water every day.
27. I believe that my home life is as pleasant as that of most people I know.
28. I am about as able to work as I ever was.
29. I have often lost out on things because I couldn’t make up my mind
soon enough.
30. It takes a lot of argument to convince most people of the truth.
31. I like to read newspaper articles on crime.
32. I enjoy detective or mystery stories.
33. I am worried about sex matters.
34. My conduct is largely controlled by the customs of those about me.
35. I am always disgusted with the law when a criminal is freed through
the arguments of a smart lawyer.
36. I think a great many people exaggerate their misfortunes in order to
gain the sympathy and help of others.
37. I think most people would lie to get ahead.
38. Most people will use somewhat unfair means to gain profit or an
advantage rather than to lose it.
39. I feel that it is certainly best to keep my mouth shut when I’m in
trouble.
40. I am likely not to speak to people until they speak to me.
41. When in a group of people I have trouble thinking of the right things
to talk about.
42. I find it hard to make talk when I meet new people.
43. It is safer to trust nobody.
44. I can be friendly with people who do things which I consider wrong.
45. I wish I were not so shy.
46. What others think of me does not bother me.
47. I frequently have to fight against showing that I am bashful.
48. I resent having anyone take me in so cleverly that I have had to admit
that it was one on me.
49. Often I can’t understand why I have been so cross and grouchy.
50. My daily life is full of things that keep me interested.
51. Most of the time I feel blue.
52. I am happy most of the time.
53. I have periods of such great restlessness that I cannot sit long in a
chair.
54. I get mad easily and then get over it soon.
55. In walking I am very careful to step over sidewalk cracks.
56. Some people are so bossy that I feel like doing the opposite of what
they request, even though I know they are right.
57. I commonly wonder what hidden reason another person may have for
doing something nice for me.
58. The sight of blood neither frightens me nor makes me sick.
59. I find it hard to keep my mind on a task or job.
60. At times I feel like swearing.


The raw score mean and standard deviation for 475 normal females
were M = 18.80, s = 5.67 and for 345 males they were M = 16.50, s = 5.50.
Test retest data from 47 cases with a three day to more than a year
interval gave an r of only .57. On a group of 98 high school girls retested
after about one year the value was only r = .47. (Data provided by
courtesy of Dora Capwell and the Minnesota State Bureau for Psychological
Services.) These low values also need explanation. Test retest
values for other scales of a comparable number of items are above .70.
Again the above clinical arguments must be resorted to. It has not been 
proved so by other objective tests but clinically observed exacerbations 
and recessions of the symptomatic picture of hysteria in a given case are 
marked. An apparently normal person placed under sufficient strain will 
surprise everyone by developing symptoms. A case with a clear paralysis 
may get well momentarily and be undetectable except on the basis of the 
history. 
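The test retest values quoted above are ordinary product-moment correlations between the scores obtained on two occasions. As a reminder of the computation only, a minimal sketch follows; the function name `pearson_r` and the six pairs of raw scores are hypothetical and are not data from this study.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical Hy raw scores for six persons, first testing and retest.
first = [12, 18, 25, 16, 30, 21]
retest = [15, 14, 27, 20, 22, 19]
r = pearson_r(first, retest)
print(round(r, 2))
```

A low value of r here may reflect real change in the persons tested as much as unreliability of the scale, which is the point argued in the text.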

Assuming the validity of the scale, the implications in routine testing 
of the foregoing discussion are interesting. If at the time of testing, the 
subject is under strain and experiencing symptomatic evidence of hysteri- 
cal conversions, the scale identifies him. If he is always on the borderline 
he is probably identified but if he is not under strain at the time he may 
not show the potentiality. It may be that similar thinking could explain
the observed clinical fact that some cases of uncomplicated and obvious 
hysterical conversion are not identified by this scale or by any that could 
be derived in the present studies. 

It is pertinent to interject that the statistical thinking derived from
aptitude and achievement testing should be amended when personality 
tests are considered. Many traits of personality are highly variable. 
Otherwise there would be little meaning to psychotherapy or preventive 
mental hygiene. Test retest data on Multiphasic scales are more a meas- 
ure of trait variance than of reliability of scales. In some cases scales 
correlate consistently much higher with other scales than with themselves. 
This will need future expanded interpretation but it at once indicates 
that several factors of personality commonly vary together. Again com- 
mon observation recognizes these variations as they are seen in those 
about us. 

Table 1 gives the intercorrelations of Hy with all other scales as 
obtained from random normal and clinic records. 


Table 1

Correlations of Hy with other scales *

                                  Hs    D     Pd(rev)  Pt    Pa    Ma    Sc
100 Normal Group Cases            .52   .65   .37      .18   .44   8     .2
100 Psychopathic Hospital Cases   .71   .68   .18      338   .40   -.13  .23

* The symbols used in this table and elsewhere in this paper refer as follows: Hs,
hypochondriasis; D, depression; Pd(rev), psychopathic deviate (revised); Pt,
psychasthenia; Pa, paranoia; Ma, mania; Sc, schizophrenia.









































The higher correlations with Hs and D are apparent. The rise of 
these for the abnormal group indicates the dynamic factor alluded to 
above. In clinical practice the three scales constitute a kind of ‘‘neurotic 
trio” that characterizes the greater number of the cases observed. 

In summary, a scale called Hy for aid in identification of hysterical 
tendency has been derived. This scale appears to measure a rather 
variable trait which is closely allied to and likely includes the earlier scale 
of hypochondriasis. The person who is especially characterized by Hy 
tends to be less obviously neurotic and to have during disabled periods, 
a more specific set of physical symptoms. 


III. The Hypomania Scale (Ma) 


Hypomania refers to the milder degrees of manic excitement occurring 
typically in the manic depressive psychoses. The cardinal symptoms of 
maniacal conditions are generally stated to be an elated but unstable 
mood, psychomotor excitement and flight of ideas. Hypomanic trends 
follow the same pattern in general, but in lesser degrees that may be at 
times so unobtrusive as not to impress even an expert. Thus, among 
normal individuals one may recall acquaintances who tend at times to be 
overtalkative, distractible, restless. Such a person may feel and appear 
to be extraordinarily well, enthusiastic and energetic, but the use of his 
energy is likely to be inefficient because he tries to do too many things at a 
time. He is usually full of ideas which may be basically sound but they 
are not adequately worked out and if put into execution are seldom carried 
through to a satisfactory conclusion. Emotionally he may be a bit elated 
and too happy, he may be impatient and irascible or he may express ideas 
of feeling gloomy and somewhat frustrated; commonly the mood swings 
rapidly within minutes or hours from one to another of these attitudes, 
often without any corresponding environmental explanation for the shifts. 
Viewed over a longer period of time it is often discernible that these 
persons tend to have periods of definite depression rather than elation 
or euphoria. Along with these characteristics, there is often egocentri- 
city, lack of appreciation of the ineptitude of his behavior in given settings 
and a certain obvious disregard for others. In many respects these 
patients, during their episodes, are reminiscent of the asocial type of 
psychopathic personality. In some of the cases their abnormal character- 
istics disappear completely between attacks.

A group of 24 such cases was selected for scale construction. These 
criterion cases had all been studied intensively as inpatients in the Psycho- 
pathic Unit of the University Hospital. Only manic patients of moderate 
or light degree were usable, since the more severe cases could not cooperate 
adequately in sorting the inventory items. The clinical diagnoses were
either hypomania or mild acute mania, depending on the severity of the
case. Care was exercised to exclude individuals with delirium, confusional 
states, or with excitements associated with other psychoses such as 
schizophrenia; the agitated depressions were likewise excluded. Natu- 
rally, routine but searching medical, neurologic, psychiatric and psycho- 
logical studies were performed on these patients as indicated. The num- 
ber of cases is obviously too small to permit an analysis of the effects of 
factors like sex, age, marital status, and economic level, but as a criterion 
group they are satisfactorily uniform for scale construction purposes. 

The selection of the differential items was done by essentially the same 
methods as for previously reported scales. The percentage frequency 
of significant responses was obtained on the normal and criterion groups
of persons for each of the 550 separate items of the inventory. Several 
scales were tentatively constructed and the following 46 items were finally 
selected as representing the best scale for hypomania. The hypomanic 
response is indicated for each item according to the answer being True 
(T) or False (F). As in all scales each item has been assigned a value of 
“one” in obtaining the raw score. 


1. I have had periods in which I carried on activities without knowing
later what I had been doing. (T)
2. I have had attacks in which I could not control my movements or
speech but in which I knew what was going on around me.
3. I have had blank spells in which my activities were interrupted and I
did not know what was going on around me.
4. At times I have fits of laughing and crying that I cannot control.
5. My speech is the same as always (not faster or slower, or slurring;
no hoarseness).
6. I sweat very easily even on cool days.
7. A person should try to understand his dreams and be guided by or take
warning from them.
8. I drink an unusually large amount of water every day.
9. My people treat me more like a child than a grown-up.
10. Some of my family have habits that bother and annoy me very much.
11. At times I have very much wanted to leave home.
12. I have often had to take orders from someone who did not know as
much as I did.
13. It makes me impatient to have people ask my advice or otherwise
interrupt me when I am working on something important.
14. I have at times stood in the way of people who were trying to do
something, not because it amounted to much but because of the
principle of the thing.
15. I believe women ought to have as much sexual freedom as men.
16. I have been inspired to a program of life based on duty which I have
since carefully followed.
17. I feel that I have often been punished without cause.
18. When I was a child I belonged to a crowd or gang that tried to stick
together through thick and thin.
19. I have never done anything dangerous for the thrill of it.
20. I am always disgusted with the law when a criminal is freed through
the arguments of a smart lawyer.
21. I do not blame a person for taking advantage of someone who lays
himself open to it. (T)
22. If several people find themselves in trouble, the best thing for them to
do is to agree upon a story and stick to it. (T)
23. I do not blame anyone for trying to grab everything he can get in this
world. (T)
24. At times I have been so entertained by the cleverness of a crook that I
have hoped he would get by with it. (T)
25. It wouldn’t make me nervous if any members of my family got into
trouble with the law. (T)
26. When I get bored I like to stir up some excitement. (T)
27. When in a group of people I have trouble thinking of the right things
to talk about. (F)
28. I find it hard to make talk when I meet new people. (F)
29. I never worry about my looks. (T)
30. It makes me uncomfortable to put on a stunt at a party even when
others are doing the same sort of things. (F)
31. It is not hard for me to ask help from my friends even though I cannot
return the favor. (T)
32. Something exciting will almost always pull me out of it when I am
feeling low. (T)
33. I have met problems so full of possibilities that I have been unable to
make up my mind about them. (T)
34. Once a week or oftener I become very excited. (T)
35. I have periods of such great restlessness that I cannot sit long in a
chair. (T)
36. At times I feel that I can make up my mind with unusually great ease. (T)
37. At times my thoughts have raced ahead faster than I could speak
them. (T)
38. I sometimes keep on at a thing until others lose their patience with me. (T)
39. At times I have a strong urge to do something harmful or shocking. (T)
40. Some people are so bossy that I feel like doing the opposite of what
they request, even though I know they are right. (T)
41. I am an important person. (T)
42. I know who is responsible for most of my troubles. (T)
43. I am afraid when I look down from a high place. (F)
44. I work under a great deal of tension. (T)
45. Sometimes when I am not feeling well I am cross. (F)
46. My table manners are not quite as good at home as when I am out in
company. (F)


Some of these items are obviously enough applicable to the usual 
concept of hypomania, but others are not explicable at present. The 
raw score mean and standard deviation from 379 females were M = 13.65,
s = 4.50 and the values from 294 normal males were M = 14.51, s = 4.42.
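
The scoring implied by the keyed list can be sketched briefly. The following Python fragment is an illustration only. It assumes that the raw score is simply the number of items answered in the keyed (abnormal) direction; the names MA_KEY and raw_score and the sample responses are hypothetical, and only six of the items are included.

```python
# Keyed ("abnormal") answers for six Ma items, copied from the list above.
# True stands for (T) and False for (F).
MA_KEY = {26: True, 27: False, 28: False, 29: True, 41: True, 43: False}


def raw_score(responses, key):
    """Count the items whose response matches the keyed direction.

    Assumption: every keyed item counts one point; unanswered items
    (missing from `responses`) count nothing.
    """
    return sum(1 for item, keyed in key.items() if responses.get(item) == keyed)


# Hypothetical answers of one subject to the six items.
sample = {26: True, 27: True, 28: False, 29: True, 41: False, 43: False}
sample_score = raw_score(sample, MA_KEY)
```

Under this sketch an omitted item neither raises nor lowers the score.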

Elated, overactive or clearly hypomanic cases rarely occur among the
neuropsychiatric clinic cases available to test the Ma scale. From among
nearly a thousand clinic cases only 38 valid records are available on persons
marked by the staff as having some overactivity or elation. These
exclude of course the criterion cases. Of the 38 only 5 were diagnosed
manic-depressive psychosis. They received scores of 77, 75, 79, 66 and
61; the latter two were marked “mild hypomanic.” The remainder of
the whole group received various diagnoses, chiefly some form of schizo- 








Minnesota Multiphasic Personality Inventory 165 


phrenia. There was evidence that hypomanic cases are more difficult 
for the staff to diagnose than are others. As might be expected from 
clinical experience, there were a number of cases called psychopathic 
personality. 

Among more than nine hundred available clinic cases 30 received 
scores of 70 or more without any clinical note especially indicating hypo- 
mania. These cases also illustrate the tendency for psychopathic per- 
sonality to be indicated by the hypomanic scale since 10 of them received 
this diagnosis or were chronic alcoholic cases. There also appeared to be 
a tendency for cases with organic deterioration of the brain to receive 
high scores. 


[Figure 2: panels “Ma Test Cases” and “Ma Criterion Cases” (x = males, o = females) above frequency curves for the normal and psychiatric groups; horizontal axis: standard score, 20 to 95.]

Fig. 2. Graphic presentation of the results with the final Ma scale showing the
general “normals,” a sample of miscellaneous clinic patients and the criterion and test
cases.





Figure 2 shows graphically the distributions of the scores of the pa- 
tients with overactivity or elation (the criterion cases), 300 randomly 
selected psychiatric clinic cases and the whole normal group. In several 
of the criterion patients with low scores there is a high probability that 
the manic state has been superseded by a normal or depressed phase at the 
time of testing. The summary of symptoms made by the staff was not 
correlated in time with the administration of the inventory so that if a 
patient alternated between manic and depressive or normal phases, his 
state at the particular time of testing cannot now be determined. The 
















evidence for the validity of Ma is certainly not conclusive. There is, 
however, a tendency for persons with hypomanic symptoms to secure high 
scores. It is to be hoped that the scale would appear distinctly better if
the criterion cases were better. This is one of several scales that will 
need to be checked further before final acceptance. Table 2 gives the 
correlations of Ma with other scales as observed on normal and clinic 
cases. 
Table 2
Correlations of Ma with other scales

                                  Hs     D     Hy    Pd    Pa    Pt    Sc
100 Normal Group Cases           .28   -.02   .05   .49   .30   .39   .56
100 Psychiatric Hospital Cases   .08   -.21  -.13   .43   .31   .14   .36











The correlation with D is slightly negative as might be expected. In 
clinical practice it is common to find both depression and manic over- 
activity in the same patient at the same time as is the case with some of the 
agitated depression patients. This was seen frequently on the test pro- 
files; it probably explains the low correlation. As was indicated in the 
validity study above there is a degree of positive relationship between 
Ma and Pd. 

The test-retest correlation for Ma is .83. This indicates that the trait 
has a surprising degree of stability in normal persons. No test-retest
correlation is available on clinic cases. There is probably an important
constant personality factor represented together with a variable factor. The constant
factor is likely to be something akin to what is commonly called optimism. 
Among our acquaintances, those whom we think of as optimists are rather 
consistently so, as are the pessimists. Apart from optimism there is also 
a variable tendency related to the usually episodic excitement of mania 
or hypomania which are seen in abnormal degree. The abnormal factor 
comes and goes and seems not to be strong among normal persons. Fur- 
ther analysis is needed to develop these theories. 
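
A test-retest coefficient of this kind is an ordinary product-moment correlation between the scores obtained on the two occasions. The sketch below shows the computation; the function name is hypothetical and the two short score lists are invented for illustration. They are not data from the study.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented raw scores for six persons on a first and a second testing.
first_testing = [10, 14, 17, 12, 20, 15]
second_testing = [11, 13, 18, 12, 19, 16]
r = pearson_r(first_testing, second_testing)
```

A coefficient near 1 means that persons keep nearly the same rank on the two occasions; the sketch says nothing about why they do so.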

The Ma scale has proved to be quite useful in the clinic. The juvenile
delinquent, the overactive adult and the agitated depression with ambivalent
affect are not frequent but nevertheless important to recognize.
The delinquent with a high Ma score and lower Pd has seemed more 
likely to benefit by counseling and being given another chance. The 
rather good prognostic indications in the adult case with an isolated Ma 
score are apparently in accord with general psychiatric opinion. 

In spite of the small number of criterion and test cases available, a 
scale for hypomania is presented. It is the best that we could derive 
from the patients seen over a five year period. The scale is certainly valid 










as to trend and it has proven distinctly useful in the clinic. The correla- 
tions with other scales are low. 

At least two factors are apparently measured. These are dominant 
constant factors allied to ebullient optimism and a more variable factor 
that accounts for abnormal periods. 


IV. The Scale for Psychopathic Deviate (Pd) 


It is not our intention in this paper to add to the already long list of 
definitions of the general clinical group “Psychopathic personality.”
Our study has accepted, as a basic group, those persons seen by us who fit 
approximately into the asocial type of psychopathic personality as de- 
scribed by Henderson (7), Cleckley (8), and others. 

Among these psychopathic personalities it was early recognized that 
there was an important subgroup, composed of individuals who were 
probably identifiable by a questionnaire. A preliminary study (9) indi- 
cated that these persons were partly characterized by a tendency to 
answer in ultra-perfect ways, as shown by such general scales as the 
B1-N component on the Bernreuter Personality Inventory. Subsequent
to this earlier work five trial scales for the identification of these persons 
have been developed. The best of these is now referred to as Pd (revised). 

The chief criterion group consisted of patients diagnosed psycho- 
pathic personality, asocial and amoral type, after study by the staff of 
the Department of Neuropsychiatry. They were from both sexes and 
were mostly within the approximate age range 17 to 22 years. None 
was psychotic or neurotic, and most of the hysterical and clearly schizo- 
phrenic cases were eliminated. 

The symptomatic backgrounds of the criterion cases were highly varied 
but can be characterized in several ways. Most often the complaint was 
stealing, lying, truancy, sexual promiscuity, alcoholic overindulgence, 
forgery and similar delinquencies. There were no major criminal types. 
Most of the behavior was of the commonly described poorly motivated 
and poorly concealed sort. All of the criterion cases had long histories of 
minor delinquency. Although many of them came from broken homes 
or otherwise disturbed social backgrounds, there were many in whom such 
factors could not be seen as particularly present. Among the criterion 
cases there was a somewhat larger proportion of girls than of boys; this 
may have been due to the social selection that results from differential 
treatment by courts of boy and girl delinquents. This factor, if it 
operated, could account for the larger number of girls since many of the 
cases came for study on request of the courts. 

Response frequencies to items as observed on the criterion group were 
compared to similar response frequencies observed on a sample of the 













married Minnesota population and the sample of college applicants. A 
number of items were then selected. This tentative list included many 
items later discarded. These items were studied further as individual 
items and more extensively as they fell into subgroups. Examples of 
such subgroups are items related to home difficulty such as, “My parents
and family find more fault with me than they should,” and social trouble
items such as, “I played hookey from school quite often as a youngster.”
From numerous minor studies a preliminary scale was derived. The 
scale was immediately valuable in the clinic. The clinical demand was 
dependent in part upon the uncertainty of the average clinician when he 
attempts to examine a case of suspected psychopathic personality. 
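
The comparison of response frequencies can be illustrated by a short sketch. The text does not state which statistic was used to select items, so the pooled standard error of a difference between two proportions is assumed here; the function name and the counts are hypothetical.

```python
import math


def endorsement_gap(crit_true, crit_n, norm_true, norm_n):
    """Compare the proportion answering "True" in two groups.

    Returns (difference in proportions, critical ratio). The critical
    ratio uses a pooled standard error, which is an assumption and not
    a procedure reported in the paper.
    """
    p_crit = crit_true / crit_n
    p_norm = norm_true / norm_n
    pooled = (crit_true + norm_true) / (crit_n + norm_n)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / crit_n + 1.0 / norm_n))
    diff = p_crit - p_norm
    # If nobody or everybody endorses the item there is no variance to test.
    ratio = diff / se if se > 0 else 0.0
    return diff, ratio


# Hypothetical item: 30 of 50 criterion cases and 60 of 300 normals say "True".
gap, critical_ratio = endorsement_gap(30, 50, 60, 300)
```

An item with a large gap in the expected direction would be a candidate for a tentative list; the later study of subgroups and test cases described in the text is not represented here.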

Two groups of test cases were available. One group was composed of 
patients in the Psychopathic Unit of the University of Minnesota Hospi- 
tals who had been studied subsequent to the selection of the criterion 
cases. These were not so carefully checked to eliminate doubtful cases 
since it was assumed that the group as a whole should show the desired 
tendency. For the other test group, we were fortunate in obtaining 
records from 100 men prisoners at a federal reformatory. These cases 
were collected by H. D. Remple, psychologist, with the cooperation of the 
medical staff and released to us for study through the courtesy of Dr. John 
W. Cronin and the United States Public Health Service. All the re- 
formatory cases had received a psychiatric diagnosis of psychopathic 
personality. It is important to note, however, that they were not dif- 
ferentiated as to type and could not be expected to be uniformly of the 
asocial type although their presence in a reformatory would indicate that 
the majority might be so. 

The test groups were used to try the excellence of various combinations of
items, though not to select individual items, and 50 items
were eventually chosen as the final scale. These items make up the Pd
(revised) scale. The following list gives the final items together with a T 
or an F to indicate the abnormal answer. 


1. I am neither gaining nor losing weight. (F) 
2. I have used alcohol excessively. (T) 
3. My family does not like the work I have chosen (or the work I intend 
to choose for my life work). (T) 
4. I believe that my home life is as pleasant as that of most people I 
know. (F) 
5. There is very little love and companionship in my family as compared 
to other homes. (T) 
6. I have been quite independent and free from family rule. (F) 
7. My parents have often objected to the kind of people I went around
with. (T)
8. My parents and family find more fault with me than they should. (T) 
9. I have very few quarrels with members of my family. (F) 
10. At times I have very much wanted to leave home. (T) 










11. My relatives are nearly all in sympathy with me.
12. I have been disappointed in love.
13. I liked school.
14. My sex life is satisfactory.
15. I like to talk about sex.
16. In school I was sometimes sent to the principal for cutting up.
17. During one period when I was a youngster I engaged in petty thievery.
18. My conduct is largely controlled by the customs of those about me.
19. I am always disgusted with the law when a criminal is freed through
the arguments of a smart lawyer.
20. I am against giving money to beggars.
21. I have never been in trouble with the law.
22. I have never been in trouble because of my sex behavior.
23. No one seems to understand me.
24. When in a group of people I have trouble thinking of the right things
to talk about.
25. I find it hard to make talk when I meet new people.
26. I do not mind being made fun of.
27. I do many things which I regret afterwards (I regret things more or
more often than others seem to).
28. I wish I were not so shy.
29. What others think of me does not bother me.
30. It makes me uncomfortable to put on a stunt at a party even when
others are doing the same sort of things.
31. I wish I could be as happy as others seem to be.
32. My daily life is full of things that keep me interested.
33. Much of the time I feel as if I have done something wrong or evil.
34. I have not lived the right kind of life.
35. I am happy most of the time.
36. I have periods in which I feel unusually cheerful without any special
reason.
37. Sometimes without any reason or even when things are going wrong I
feel excitedly happy, “on top of the world.”
38. At times my thoughts have raced ahead faster than I could speak
them.
39. If people had not had it in for me I would have been much more successful.
40. Someone has it in for me.
41. I am sure I get a raw deal from life.
42. I am sure I am being talked about.
43. I have had very peculiar and strange experiences.
44. I know who is responsible for most of my troubles.
45. I have very few fears compared to my friends.
46. These days I find it hard not to give up hope of amounting to something.
47. My hardest battles are with myself.
48. I am easily downed in an argument.
49. I find it hard to keep my mind on a task or job.
50. My way of doing things is apt to be misunderstood by others.


Inspection of the final scale items will show them to fall naturally into 
several general groupings. For one, social maladjustment items are 
prominent. Another group is made up of items related to depression and 
the absence of strongly pleasant experiences. There are also a number of 
items suggesting paranoid trends. All of these subgroupings were found 













to contribute to validity. The composite scale weights of the groups as 
expressed in the number and occurrence frequencies of their items are 
apparently nearly optimal in the final scale. It was difficult to account 
for or predict variations in validity that were observed among the earlier 
scales; this indicates that the diagnosis is based upon a complex of factors 
rather than upon any one. Also, unlike the scale for psychasthenia, the 
items do not show a strong tendency to be highly intercorrelated. The 
final scale is, therefore, certainly not pure but deliberately mixed in factor 
content to yield greater clinical usefulness. 


















[Figure 3: panels “Pd Clinic Test Cases” and “Federal Reformatory Cases” (x = males, o = females) above the frequency curve for the normals; horizontal axis: standard score, 20 to 110.]

Fig. 3. Graphic presentation of the standings of the test group for Pd and of
Federal Reformatory cases as against the curve for the general “normal” group.


Figure 3 shows graphically the standing on the final scale of 294 males 
and 397 females of the general norm group. As is common with fre- 
quency curves from personality traits having one end recognized as ab- 
normal, there is a slight negative skewness of the curve. About 4.6 per 
cent of these normal cases fall above 70 (two sigma above the mean). In 
these normative data there is no significant mean score change with in- 
creasing age, but it is probably significant that 56 per cent of the cases 
with standard scores above 70 are from the 16 to 25 age range; these 
young people make up only 33 per cent of the total norm group. The 
raw score sex difference amounts to 0.45 of a point between means and
0.23 between standard deviations. These differences are probably insignificant
but a correction in standard score is made for sex since the
observed values are still the most likely correct. The means and standard
deviations for raw scores are M = 13.44, s = 4.23 for 397 females and
M = 12.99, s = 4.00 for 294 males.
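
The relation between raw and standard scores may be illustrated as follows. The sketch assumes the linear conversion implied by the text, in which the normal mean is placed at 50 and each standard deviation counts ten points (so that 70 lies two sigma above the mean). The names PD_NORMS, standard_score and normal_tail_above are hypothetical, and the published conversion tables may differ in detail.

```python
import math

# Raw-score norms for Pd as reported in the text: sex -> (mean, standard deviation).
PD_NORMS = {"female": (13.44, 4.23), "male": (12.99, 4.00)}


def standard_score(raw, sex, norms=PD_NORMS):
    """Assumed linear conversion: 50 at the normal mean, 10 points per sigma."""
    mean, sd = norms[sex]
    return 50.0 + 10.0 * (raw - mean) / sd


def normal_tail_above(z):
    """Proportion of a strictly normal curve lying above z standard deviations."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

Under these assumptions a male raw score of 21 falls almost exactly at a standard score of 70, and a strictly normal curve would place about 2.3 per cent of cases above that point, against the 4.6 per cent observed in the norm group.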










Figure 3 also shows the standard score standings of the two groups of 
test cases. There are 78 cases diagnosed psychopathic personality from 
the Psychopathic Unit and 100 prisoners from the reformatory. Again 
it must be emphasized that the scale was expected to identify only the 
asocial fraction of these miscellaneous cases. The separation of the prison 
group is better than that of the clinic group, chiefly because of the selec- 
tive factor determining a greater frequency of the asocial type among the 
prisoners. Among the prisoners 59 per cent obtained scores at or above 
70 and among the clinic cases 45 per cent of the scores were above that 
level. The mean raw scores and the standard deviations are M = 22.61,
s = 4.43 for prisoners and M = 21.44, s = 6.23 for clinic cases.

The validity of the scale appears still better if the whole profile of 
each test case is inspected. The profile is made up of all scores at present 
obtained from the Multiphasic Inventory. When the other personality 
components are graphically presented in the same standard scores, a 
number of the test cases scoring below 70 on the present scale show up as 
plainly outstanding. Thus, it is common for scores on other scales to be 
uniformly from one-quarter to one-half standard deviation distance below 
the mean, leaving the Pd score clearly dominant. It is possible that this 
effect which appears to be a general reduction in the measured abnormality 
is produced by overly scrupulous conscious avoidance of any betrayal 
by abnormal answers on the part of the subject. More likely these 
persons simply feel themselves to be overly perfect. Evidence for the 
latter suggestion lies in the fact that they seem clinically to be character- 
ized by great self-esteem and self-interest. 

Inspection of the items in the scale shows that there is a group of items 
on which the significant answer shows this over-perfect tendency. Con- 
sider, for example, the item, “I find it hard to talk when I meet new peo- 
ple.” Most of the normal group admit such failings but the psychopath 
has no such reaction. It is interesting that these persons who might be 
most likely to attempt consciously to hide their character are partly 
identifiable by the attempt itself. It is, however, unlikely that such 
reactions are conscious; they are more likely completely submerged from 
the conscious level by the insightless egocentricity. 

The test-retest correlation obtained for the scale is 0.71; this was 
obtained on a normal sample of 47 cases repeated with an interval of a 
few days to more than a year. The correlations with other scales com- 
monly used on the Multiphasic Inventory are given in Table 3. It is 
interesting that, contrary to what is true with other Multiphasic scales, 
the correlations are smaller as obtained from the Psychopathic Unit cases. 

The random cases from the Psychopathic Unit included a few psycho- 
pathic personality diagnoses; they were predominantly neurotic and schiz- 
















Table 3 
Correlations of Pd (revised) with other scales 








                                 Hs    D    Hy    Pa    Pt    Ma    Sc
100 Normal Group Cases           (values illegible)
100 Psychopathic Unit Cases      (values illegible)











ophrenic. Although the increase is slight, the correlations with paranoia 
and mania are somewhat higher than with other scales. This is in accord 
with clinical experience. The hypomanic patient often gets into trouble 
during one of his attacks in ways that are confusingly similar to the be- 
havior of the psychopathic personality. Similarly, the case with a psy- 
chopathic personality is frequently somewhat paranoid. Being basically 
confident of his abilities, he naturally often feels persecuted by society 
when he is punished for behavior he thinks he will be able to control in 
the future. 

The introductory sentence to Part IV of this paper stated that we did 
not wish to add to the list of definitive statements about psychopathic 
personality. Yet it seems apparent from the foregoing facts that an
appreciable percentage of clinically recognized cases are identified on the
Pd (revised) scale. Generally speaking, these persons are those diagnostically
classified in most clinics as psychopathic personality, asocial
type. Nevertheless, to avoid confusion we have named the scale Psycho- 
pathic Deviate (Pd). The term implies a variation in the direction of 
psychopathy. The scale itself is a definitive device and the following 
descriptive material is merely an attempt to state some general facts about 
cases selected by the scale. 

Most prominently the typical case has a shallow emotional life. The 
clinician may work very hard and become intensely interested in the 
patient but fail to receive in return more than a transitory and superficial 
loyalty. Sexual and other appetitive drives are not deeply effective in 
the patient’s life. For example, although there may be promiscuity or 
actual prostitution, the female is frequently frigid and engages in sexual 
acts more as a means to social entertainment. Females are more often 
masculine in interests. The psychopathic deviate seems to the observer 
to seek more and more dangerous or embarrassing experiences attempting 
to feel emotion like that of the normal. They sometimes commit suicide
or more often nearly do so. This is again from shallow emotional sources
rather than deep depression or normal recognition of failure. 

As they become older it is common for many of these cases to more 
successfully avoid real conflict with society. The lying, alcoholism, 













sexual promiscuity or other behavior may persist; but it is somewhat more 
restrained and also society seems to feel less outraged. While these per- 
sons can usually verbalize as to the consequences of their behavior, there 
is often a failure to appreciate its significance for them in terms of their 
long time social adjustment. Depression, when present, is usually expressed
as fear of immediate punishment and loss of liberty rather than
any reaction in guilt, regret or the like. The tendency to blame others or 
to excuse themselves for their predicament is common. They claim in
self-extenuation that they were misled by others who took advantage of
their innocence, that the family discipline had been too severe so they
rebelled, or offer some similar explanation.

In clinical practice, the Pd scale has been most valuable. So many 
of the cases with high scores are recidivists in delinquency that it is 
helpful to be put on guard. If the person is 16 to 19 years of age and has 
a score twenty T points above most other scores on the profile, there is 
little likelihood that the person can stay out of trouble if not under rigid 
discipline. Older persons, however, more often avoid open breaks. In 
therapy, young persons with a high Pd should not be pushed toward 
maximal scholastic or vocational levels even when they have the capacities 
for training. 
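
The rule of thumb about a dominant Pd score can be restated as a small check on a profile. This is a sketch only: “most other scores” is taken here to mean more than half of the other scales, which is an assumption, and the function name and the sample profile are hypothetical.

```python
def pd_clearly_dominant(profile, age, margin=20):
    """Restate the rule: age 16 to 19 and Pd at least `margin` T points
    above most other scales.

    Assumption: "most" is read as more than half of the other scales
    in the profile.
    """
    others = [t for scale, t in profile.items() if scale != "Pd"]
    exceeded = sum(1 for t in others if profile["Pd"] - t >= margin)
    return 16 <= age <= 19 and exceeded > len(others) / 2


# Hypothetical standard-score profile for one subject.
example_profile = {
    "Hs": 48, "D": 52, "Hy": 50, "Pd": 75,
    "Pa": 58, "Pt": 47, "Ma": 62, "Sc": 51,
}
flag_young = pd_clearly_dominant(example_profile, age=17)
flag_older = pd_clearly_dominant(example_profile, age=30)
```

In the example Pd stands twenty or more points above five of the seven other scales, so the check is met for the subject of 17 and, by the age condition alone, not for the subject of 30.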

One special advantage in the prediction afforded by the scale is that 
the type identified is so often characterized by a relatively appealing 
personality together with good intelligence and background. These 
factors are misleading to clinicians so that a halo effect operates toward 
a too lenient view of the clinical problem. The overly optimistic treat- 
ment is not only wasteful of social resources but also permits the fixation 
of undesirable habits in the patient; furthermore, the patient is permitted 
to continue until a more serious offense requires penal action. 

In summary, a final scale has been developed which will identify half 
or somewhat more of the cases routinely classed psychopathic personality 
clinically. The cases best identified are those with strong asocial trends. 
The scale is called psychopathic deviate to indicate that it is not expected 
to differentiate all the cases of psychopathic personality. The scale 
appears to have fair reliability and intercorrelations with other scales are 
low. 

It is in the clinic, however, that the value of the scale is better illus- 
trated. The clinician usually finds himself at a loss in the diagnostic 
evaluation of the psychopathic personality, and the scale has been found 
to be particularly useful in this regard. 


Received February 7, 1944. 
































References 


1. Hathaway, S. R., and McKinley, J. C. A multiphasic personality schedule: I.
Construction of the schedule. J. Psychol., 1940, 10, 249-254.
2. McKinley, J. C., and Hathaway, S. R. A multiphasic personality schedule: II. A
differential study of hypochondriasis. J. Psychol., 1940, 10, 255-268.
3. Hathaway, S. R., and McKinley, J. C. A multiphasic personality schedule: III. The
measurement of symptomatic depression. J. Psychol., 1942, 14, 73-84.
4. McKinley, J. C., and Hathaway, S. R. A multiphasic personality schedule: IV.
Psychasthenia. J. appl. Psychol., 1942, 26, 614-624.
5. Hathaway, S. R., and McKinley, J. C. Manual for the Minnesota multiphasic
personality inventory. Minneapolis: University of Minnesota Press, 1943.
6. Leverenz, Major C. W. Minnesota multiphasic personality inventory: An evaluation
of its usefulness in the psychiatric service of a station hospital. War Med., 1943,
4, 618-629.
7. Henderson, D. K. Psychopathic states. New York: W. W. Norton, 1939.
8. Cleckley, Hervey. The mask of sanity. St. Louis: C. V. Mosby Co., 1941.
9. Hathaway, S. R. The personality inventory as an aid in the diagnosis of psychopathic
inferiors. J. consult. Psychol., 1939, 3, 112-117.


A Note on the Clinical Use of the Hunt-Minnesota Test 
for Organic Brain Damage 


Howard F. Hunt 
University of Minnesota 


The Hunt-Minnesota Test for Organic Brain Damage (1, 2) has been 
used routinely in the Neuropsychiatric Clinic of the University of Min- 
nesota Hospitals for approximately nine months. This note summarizes 
the clinical experience with the test since its completion. It has been 
administered to 68 neuropsychiatric patients referred for psychologic 
examination for organic brain damage.* The great majority of these 
cases were diagnostic problems without clinically obvious deterioration 
referred because of suggestive case histories or equivocal neurologic signs. 
A minority of the referrals involved confirmation or evaluation of the 
extent of the clinically observed deterioration. In many of the cases, 
the examiner was unaware of the neurologic status of the patient. 

Performance on the test is expressed in terms of a T score; presumably 
T scores over 66 indicate organic brain damage. When the test score 
of each patient is compared with his final clinical evaluation, in 7 (10.3%) 
of the cases the score and the clinical evaluation disagree, in 54 cases 
(79.4%) the two agree, and in 7 (10.3%) cases the test results were invalid. 
Of the seven cases of disagreement, two were diagnosed early multiple 
sclerosis (T = 50 and 64) while one case each carried the following diag- 
noses: early meningo-vascular neurosyphilis (T = 50), petit mal epilepsy 
(T = 60), spastic paraplegia, possibly Little’s Disease (T = 63), undiag- 
nosed neurologic pathology (T = 60), and one case that had suffered 
cerebral concussion three years previously (T = 62). In three of the 54 
cases where a diagnosis of deterioration was agreed upon, the clinical 
evaluation was based on psychiatric observation alone and not on neuro- 
logic signs of pathology or a clearly significant history. In the remaining 
7 cases, the test results were invalid because of inadequate vocabulary 
level, poor cooperation, and the like. 
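
The percentages reported above follow directly from the counts, as the short computation below shows. The variable names are hypothetical; the final figure, agreement among the valid records only, is not stated in the text and is added here simply as arithmetic on the same counts.

```python
# Outcome counts for the 68 patients, as reported in the text.
counts = {"disagree": 7, "agree": 54, "invalid": 7}
total = sum(counts.values())

# Percentage of all 68 cases in each category, to one decimal place.
percent_of_all = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Agreement among the records that were valid (derived, not quoted).
valid = counts["agree"] + counts["disagree"]
percent_agree_valid = round(100 * counts["agree"] / valid, 1)
```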

The short (15 minute) form of the test was used in 36 of the cases. 
This form has been found to be as adequate as the long form for coopera- 
tive subjects, but its use is definitely contra-indicated for borderline or 
uncooperative patients. 


*Mr. Paul E. Meehl, Clinical Assistant in Psychology, University of Minnesota 
Hospitals, tested 25 of these cases. 













176 Howard F. Hunt 





The above evidence indicates the validity or the efficiency of the test 
as used in the clinic at the University of Minnesota Hospitals. The 
following paragraphs represent a series of miscellaneous, clinically valu- 
able observations evolved from its routine use. 

The degree of deterioration associated with any given T score magni- 
tude will depend upon the age and vocabulary status of the patient; the 
T score is not an accurate index of degree of deterioration. The test 
should be considered as a sorting or ranking device yielding probability 
estimates rather than precise measurements of the degree of deterioration. 
Brain damaged persons usually obtain T scores above 66, while most 
clearly normal persons obtain scores below 60; T scores above 60 probably 
justify suspicion of pathology. For practical purposes, a T score of 66
(not 68 as suggested in the manual) should be considered as the “critical
score” dividing normal from abnormal performances. However, a slavish
adherence to a “critical score” is not recommended.
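
The score bands just described can be written out as a small sorting rule. The sketch assumes that scores over 66 are treated as abnormal, scores over 60 up to 66 as grounds for suspicion, and the rest as within the normal range; the function name and the band labels are hypothetical, and, as the text cautions, such a rule is no substitute for clinical judgment.

```python
def sort_t_score(t):
    """Place a T score in one of three bands (boundaries as assumed above)."""
    if t > 66:
        return "beyond critical score"
    if t > 60:
        return "suspicion of pathology"
    return "within normal range"


# T scores of the seven cases in which test and clinical evaluation disagreed.
disagreement_scores = [50, 64, 50, 60, 63, 60, 62]
bands = [sort_t_score(t) for t in disagreement_scores]
```

None of the seven disagreement cases exceeds the critical score, but three of them (64, 63 and 62) fall in the band that would justify suspicion.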

Weighted scores are provided in the norm tables in the manual for 
ages between 16 and 70 and for vocabulary levels down to 7 words. A 
few calculations will demonstrate to the clinician, however, that it is 
arithmetically impossible for older persons with very low vocabulary 
levels to obtain high T scores even though they fail all of the learning 
tests. Table 1 gives the minimum vocabulary level at various ages for 
which T scores of 70-71 can possibly be obtained. Persons at or above 
these various ages cannot be tested validly unless their vocabulary exceeds 
the minimum level for their age. 











Table 1
Minimum Vocabulary Scores for Various Chronological Ages

                 Vocabulary Levels
Age          Short Form    Long Form
50 years      9 words
55 years     11 words       8 words
60 years     13 words      10 words
65 years     15 words      12 words
70 years     17 words      14 words





Young persons with high vocabulary scores can be tested validly, 
though in general the test is maximally efficient only within the age range 
of 20 to 55 years and within the vocabulary range of 12 to 32 words. 
Records of persons with age or vocabulary scores outside these ranges 
must be interpreted with caution. 
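
The use of Table 1 can be sketched as a simple lookup. Several assumptions are made and should be checked against the manual: the single entry at 50 years is taken to belong to the short form, a person between two tabled ages is given the row for the nearest tabled age at or below his own, and “exceeds” is read strictly. The names MIN_VOCAB, minimum_vocabulary and high_t_attainable are hypothetical.

```python
# Table 1 minimums: age -> (short form, long form); None where no value is tabled.
MIN_VOCAB = {
    50: (9, None),
    55: (11, 8),
    60: (13, 10),
    65: (15, 12),
    70: (17, 14),
}


def minimum_vocabulary(age, form):
    """Return the tabled minimum for the highest tabled age not above `age`.

    Returns None when no minimum applies (younger than every tabled age,
    or no value tabled for that form).
    """
    column = 0 if form == "short" else 1
    eligible = [a for a in MIN_VOCAB if a <= age]
    if not eligible:
        return None
    return MIN_VOCAB[max(eligible)][column]


def high_t_attainable(age, vocabulary, form):
    """True if the vocabulary score exceeds the minimum for this age and form."""
    minimum = minimum_vocabulary(age, form)
    return minimum is None or vocabulary > minimum
```

For instance, under these assumptions a person of 62 with a vocabulary score of 13 would not qualify on the short form but would qualify on the long form.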

Though the interpolated tests (in the long form of the test) were 
included as validity indicators sensitive to poor cooperation, inattention, 











Clinical Use of Hunt-Minnesota Test 177 


excessive emotional disturbance, and the like, they have not proven 
sensitive enough to justify uncritical reliance. The records of persons 
failing three or more of the interpolated tests are usually invalid. On 
the other hand, a few persons may pass all of them and yet not be co- 
operating fully and attending adequately, the tests being too insensitive 
or too easy to detect this. The records of persons who manifest obvious 
inattention and resistance to testing are probably invalid or of doubtful 
validity regardless of their performance on the interpolated tests, particu- 
larly in cases showing agitation and depression. The examiner must be 
the judge of the validity of any given test record. 

The T score variably underestimates the degree of brain damage in 
persons who have been injured prior to attainment of intellectual ma- 
turity. For example, the test is not particularly applicable to cases of 
birth injury or other early or developmental defects. The T scores in 
cases of marked language handicap where the subject is equally or more 
familiar with a foreign language than with English also show a similar 
tendency. The records of all such persons must be interpreted with 
caution. 

Experience indicates that the test may be repeated at least once fol- 
lowing an interval of three or more weeks without appreciable practice 
effects, especially in working with brain damaged persons. 

The deterioration indicated by high T scores is predominately of 
organic origin. The reversibility of this deterioration depends upon the 
assumed underlying pathology rather than upon the test score. With 
regard to traumatic head injury, evidence is accumulating which suggests 
that the T score may decrease to a variable extent in some cases upon 
clinical recovery. Thus, upon recovery of cerebral physiologic equilibrium 
or upon remission of a possible “diaschisis” effect, the T score 
may decrease, though it does not necessarily become normal. Present 
evidence also suggests that high T scores can occur in some cases of 
cerebral physiologic disturbance such as carbon monoxide poisoning or 
alcoholic intoxication. Remission of the disturbance apparently may be 
accompanied by at least partial reduction in the deterioration if no ex- 
tensive, concurrent brain damage has occurred. 

As is well known, extensive cerebral damage can result in intellectual 
deterioration without resulting in neurologic evidence of pathology, as in 
the early presenile dementias, bilateral prefrontal lobectomy and lo- 
botomy, or other “silent area” abnormality. On the other hand, small 
lesions in the spinal cord, midbrain, basal nuclei, or in the primary cortical 
projection areas can readily be detected by the neurological examination. 
But, since the mass of cortical tissue involved is small, these lesions are 
often unaccompanied by intellectual deterioration, as in some cases of 
early multiple sclerosis or encephalitis. The presence of deterioration is, 
therefore, obviously not expected to be correlated perfectly with the pres- 
ence of pathological neurologic signs. Disagreement between neurologic 
examination findings and deterioration test results does not necessarily 
indicate error on the part of either method. 

The vocational and psychiatric prognosis associated with mild dete- 
rioration will depend upon the demands of the contemplated occupation 
and its familiarity to the patient, as well as upon his intellectual level and 
previous habits of adjustment. Several of our cases with mild deteriora- 
tion have been able to continue with familiar though relatively compli- 
cated occupations which demand a minimum of new learning, originality, 
and emotional stability. On the other hand, the complaint in several of 
our deteriorated cases involved inability to perform previously easy and 
familiar tasks. In general, however, even mild deterioration is probably 
an unfavorable prognostic sign with regard to both vocational and 
psychiatric adjustment. 


Received February 7, 1944. 


References 


1. Hunt, Howard F., A practical, clinical test for organic brain damage. J. appl. 
Psychol., 1943, 27, 375-386. 

2. Hunt, Howard F., The Hunt-Minnesota test for organic brain damage. Minneapolis: 
The University of Minnesota Press, 1943, pp. 1-8. 










A Reply to Dr. Donald E. Super 


John G. Darley 
NDRC Project SC-70, NS-146, Camp Murphy, Florida 


The editor of this Journal has permitted me to reply briefly to Dr. 
Super’s recent review of my book.¹ There are essentially three aspects 
of a book review to which replies can fairly be made: aspects of disagree- 
ment over factual material and its interpretation; aspects of disagreement 
over emphases in content; and aspects of disagreement over viewpoint 
and end-purpose of the book. 

Dr. Super happily finds no cause for disagreement in regard to the 
factual material presented, but he apparently disagrees strongly with my 
emphases and the viewpoint I have tried to present. This is true to the 
extent of stating that guidance and counseling are misrepresented, “in 
line with two fallacies revealed in a number of the Minnesota publications.” 

The “fallacies” for which the Minnesota studies are thus stigmatized 
turn out to be no more than differences of opinion regarding the extent 
of the role of tests in clinical work and differences of opinion regarding the 
role to be played by the counselor. It is misleading to have such differ- 
ences of opinion distorted to the implied level of a factual or theoretical 
fallacy. This makes it appear that Dr. Super has evidence at least of 
the presumed fallacious nature of these emphases, if not evidence of the 
superiority of other methods. 

While this is not the place to enter into a generalized defense of the 
Minnesota writings, one fact needs to be reiterated. The excellent series 
of studies by Williamson and Bordin has demonstrated that the techniques 
developed at Minnesota produce improved adjustment among 
students; the critics of the Minnesota writings have maintained a thun- 
dering silence regarding these studies, while they persist in their denuncia- 
tion of an allegedly autocratic and undemocratic role imputed to the 
counselor. The reviewer even resorts to the classic critical technique of 
quoting out of context to buttress his point in this regard. The exponents 
of the transfer of learning theory in personal problem solving have yet to 
demonstrate their hypotheses; furthermore the extreme categories of 
autocratic and non-autocratic counseling are rhetorical constructs rather 
than demonstrable types, in view of individual differences among good 
counselors. 

¹ J. G. Darley, Testing and counseling in the high school guidance program. 
Chicago: Science Research Associates, 1943. Pp. 222. See D. E. Super’s 
review, J. appl. Psychol., 1943, 27, 546-548. 

It may be also that my attempt to write a self-contained introduction 
to counseling for non-psychologists was unwise. But in view of the fact 
that most high-school counselors are non-psychologists and will continue 
to be non-psychologists, psychologists have the option of translating their 
methods for this large group or of ignoring the group in the field. The 
exigencies of translation and the frequent misunderstandings of test 
techniques in this group as seen from the author’s experience combine 
to determine the emphases of the book. The experience of others might 
well produce different emphases. 

With the exception of his synonymous use of fallacies and differences 
of opinion, Dr. Super has written a careful and thoughtful review. It is 
to be hoped that both the critics and exponents of various forms of 
counseling may someday combine their efforts to design crucial experi- 
ments on the processes of therapy, and thus turn their analytical powers 
to the interpretation of research findings instead of philosophic debates 
involving alleged types of professional workers. 


Received January 28, 1944. 


Book Reviews 


Ordway, Samuel H., Jr. [Chairman]. Oral tests in public personnel selec- 
tion. A report submitted to the Civil Service Assembly by the Committee 
on Oral Tests in Public Personnel Selection. Chicago: Civil Service 
Assembly of the U.S. and Canada, 1943. Pp. xviii + 174. $3.00. 


This report by the Civil Service Assembly’s Committee on Oral Tests 
in Public Personnel Selection is one of a series of volumes dealing with the 
major phases of public personnel administration. Responsibility for the 
structure and final form of the report was vested in the committee 
chairman, Samuel H. Ordway, Jr., with major contributions by Louis J. 
Kroeger, William A. Hannig, Willard E. Parker, Walter V. Bingham, 
James C. O’Brien, Paul J. Kern, and Cyrus C. Perry. 

The purpose of the oral test in the selection process is to appraise those 
personal factors which affect job performance but which cannot be satis- 
factorily appraised in any other way. Interest, persistence, emotional 
stability, learning aptitude, and social adaptability are factors or traits 
that can be so appraised. Opportunity is also provided in the oral test 
for probing the work history as a measure of aptitude or ability, but the 
oral test process is time consuming and costly and should not be used 
where more objective or convenient techniques are available. 

The first step in an effective examining procedure is job analysis: the 
determination of qualifications, behavior patterns, characteristics, capaci- 
ties, etc., which are essential for successful performance on the job. The 
second step is the allocation to the oral test of those worker characteristics 
which can best be measured by this technique and the determination of 
the smallest component elements which can be rated. 

Evidence is educed in the oral test through statements, observations, 
and behavior patterns, and this evidence, to be conducive to objective 
rating, must be “relevant,” “material,” and “reliable.” Analogy is here 
made to the criteria of admissible evidence in a court of law. The rating 
process should be standardized; i.e., the rating scale, the factors rated, 
and the factor weights. The oral board should consist of (1) a specialist 
in oral examining on the staff of the personnel agency, (2) a representative 
of the employing agency, and (3) an expert in the occupational field. 
Hazards to be avoided by interviewers are “pseudo-scientific” methods 
of personality determination, halo effect and conditioning, generalization 
errors, exaggeration, fatigue, and the use of suggestion on the part of the 
interviewer. 




For the purpose of appeal and review, and in fairness to candidates, an 
adequate record of the oral test should be kept. Procedures should be 
established for the filing of the appeal, the method of hearing, and the 
remedy for errors. Common errors in oral test procedure are failure to 
consider relevant and material evidence, faulty application of the rating 
scale, bias and favoritism, and irrelevant or insufficient standards for 
rating. The function of the court in review of oral examinations is to 
determine whether constitutional and statutory provisions have been 
complied with. 

“Sporadic efforts to evaluate the competitive oral test process in a 
small number of jurisdictions have disclosed little evidence of validity of 
the types of interview they use. However, this does not necessarily 
constitute a reflection on the worth of these processes. Very often the 
lack of evidence of validity springs from a related lack of objective yard- 
sticks for determining success on the job. . . . Conclusions drawn from 
such studies are largely inferential and this fact should be kept in mind 
when the findings are appraised.” 

Where a set of concepts and techniques are incapable of operational 
definition, or await such definition, their description and interpretation 
by a group, such as a committee, may be poorly integrated. Such failure 
of integration is evident in the present report. Such terms as probity, 
relevance, materiality, and trustworthiness are used in one section and 
correlation, validity, and halo in another. While semantic and conceptual 
differences are to be regretted, more serious is the failure to reconcile 
divergent points of view as to what it is the oral test measures. At one 
point it measures command of language, clarity, and correctness of speech, 
ability to grasp the point of a question, etc.; at another, social adapt- 
ability, emotional stability, supervisory and administrative ability, etc. 
Present also to some degree in the point of view of the collaborators is the 
traditional conflict between the standardized, rigidly structured interview 
and the unstructured quasi-clinical interview. 

The amount of space allotted to the discussion of appeal and review 
procedure is disproportionate. Whereas two chapters or 20 pages are 
devoted to such a discussion, the general evaluation and validity of the 
interview is treated in one chapter of seven pages. While the problem of 
oral test review by courts and administrative bodies is an omnipresent 
one and a true examining problem, it should be remembered that as the 
validity of the oral test is increased by research and objectivity, the 
problem of appeals diminishes in importance. 

Of more serious import are the tendencies toward a “legalistic” 
validation of the oral test. While it is certainly true that evidence which 
is “material,” “relevant,” and “reliable” is the most valid kind of evidence 
upon which to determine ratings, logic or precedent, as in a court of law, 
cannot serve as criteria of what is “material,” “relevant,” and “reliable.” 
This may only be established by correlation of specific oral test functions 
with job criteria. A statistically valid oral test will pari passu be based 
upon “material,” “relevant,” and “reliable” evidence. 

Documentation and reference to existing studies on the oral test are 
meager. For example, in the general discussion of the validity and relia- 
bility of the interview seven citations are made which, with two exceptions, 
were all published a decade or more ago. The contemporary industrial 
literature on the employment interview is satisfied by a citation to Viteles’ 
Industrial Psychology, and no reference is made to such recent innovations 
as the “stress” interview, or Rogers’ phonographic methods as a tech- 
nique for training interviewers. The assumption in the book is unfounded 
that the oral test is valid until proven otherwise because studies demon- 
strating the lack of validity of the interview are based upon fallible 
criteria of job success. No measurement technique is objectively valid 
until demonstrated as such by scientific method. Lack of definitive tech- 
niques on the part of a discipline does not in any sense constitute scientific 
validation. 

Specifically commendable is the expressed and reiterated need for 
research on many points, e.g., the weighting of rating factors, the insist- 
ence upon a thorough understanding of the job before rating is attempted, 
and as a whole, the chapter on the training of interviewers. 

If evaluation of the report reveals that it falls short of being a defini- 
tive statement of the oral test, then one must hasten to point out that in 
an area where even fundamentals are nebulous and conflicting, the clari- 
fication of concepts and techniques even by approximation is most wel- 
come. If this report is instrumental in stimulating public personnel 
administrators to review the oral test practices in their jurisdiction, it is a 
job well done. 


Arthur Burton 
California State Personnel Board 


McNally, H. J. The readability of certain type sizes and forms in 
sight-saving classes. New York: Teachers College, Columbia University 
Contributions to Education, No. 883, 1943. Pp. vi + 71. $1.75. 


In the study here reported, an attempt was made to evaluate the read- 
ability of certain types and type sizes for children with handicapped 
vision. The typography employed was chosen partly in terms of practice 
in sight-saving classes, and partly from consultation with “professional 
authorities” in the field. The 6 printing arrangements used follow: 
12-point Caslon Bold No. 3, Linotype in a 24-pica line width and with 
4 points leading; 14-point Caslon Bold No. 3, Linotype in a 30-pica line 
width with 4 points leading; 18-point Caslon Bold No. 79, Monotype in 
a 36-pica line width and with 6 points leading; 24-point Caslon Bold No. 
79, Monotype in a 40-pica line width and with 6 points leading; typed 
and mimeographed materials closely resembled the set-up for the 24-point 
type. Six modified forms of Gates’ tests for measuring speed of reading 
constituted the textual material. Seventy-two children from the New 
York City schools were subjects. Illumination during the testing ranged 
from 27 to 35 foot-candles. 

Relative visibility of the samples (adult subjects) ranged from 47 to 
100 per cent. Speed of reading and rate of blinking were employed as 
criteria of readability. The experiment was designed so that analysis of 
variance could be used in treating the data. 

The analysis revealed no significant differences in either rate of reading 
or blink frequency for the various typographical variations. The speed 
criterion did reveal significantly large differences among test forms and 
order of testing. There is a suggestion that the larger types were better 
for the hyperopes. The author states that “there is evidence that the 
criteria—particularly the criteria of eye-blink frequency—did not measure 
adequately what differences there may have been.” 

This report is a good example of a carefully planned experiment that 
results in inconclusive findings. The author has recognized the major 
weaknesses involved: length of the reading period, differences among 
observers in counting eye-blinks, the question of reliability and validity 
of eye-blink frequency as a criterion of readability, and the need of more 
sensitive measures of visual effort expended. The author failed to em- 
phasize that a major consideration with visually handicapped children is 
ease and comfort rather than speed in reading. In this situation perhaps 
greater stress should be placed upon preference of the children for a certain 
typographical arrangement. Furthermore, a greater dependence upon 
word-form clues would lessen the need of fine discrimination in word 
perception. 

Miles A. Tinker 

University of Minnesota 


Stern, Edith M. Mental illness; a guide for the family. New York: 
The Commonwealth Fund, 1942. Pp. xvi + 134. $1.00. 


This short volume written under sympathetic medical guidance and 
criticism is addressed to the relative of the mentally ill patient. It is 
simply and attractively written and answers in a direct, digestible and 
comforting manner most of the many questions which arise or lie 
disturbingly dormant in the minds of troubled relatives. It is a book which 
many physicians, ministers, psychologists, and social workers will want 
to hand to their clients and constituents when pertinent problems arise. 

Each chapter is a unit. Each deals with a specific issue such as: why 
hospitalize? private or public hospitals? getting the patient admitted, 
taking the patient to the hospital, leaving him there, the first month in 
the hospital, letters and visits, discharge, when the patient comes home, 
and the permanence of recovery. Anyone who has experienced these 
issues with a relative can visualize the pages of this book being read, 
reread, and to a certain extent heeded during the early weeks of the crisis. 
Too often therapy for the well parent or kinsman has been neglected. He 
has been left to discover his own verbalizations and arrangements which 
are often inimical to the complete recovery of the patient who is to return 
to the home. Mrs. Stern’s elementary manual is therefore a real con- 
tribution to psychotherapy. 

Individuality of personality, the relative nature of mental illness, and 
the psychogenic influences in most psychoses, are recognized and well 
presented. The author in true hygienic form is realistic, faces with the 
reader the many problems which indubitably will arise and offers specific 
adaptive suggestions. The attitudes of relatives and the people down the 
street toward mental disorders are constantly mentioned, explanations of 
them as well as methods of coping with them are given. In line with this 
emphasis the reader is informed about the mental hygiene movement and 
is urged to support it. A glossary of commonly used technical terms 
and a section dealing with forms of therapy are part of this theme. 

Although the author does not overlook the too numerous cases in 
which legal provisions for the mentally diseased are antiquated and existing 
hospitals fall short of the ideal, the kinsman will feel that, despite 
these limitations, there are therapeutic provisions for his relative close 
at hand. 

The professional worker will appreciate the appendices that name the 
states which oversee private institutions, give the various methods of 
admission to public hospitals, and list states which have social workers, 
legal provisions for family care and mental hygiene societies. This 
meaty little book would have been improved by the inclusion of an index 
even though its 134 pages are divided into 18 chapters and five appendices. 

Fred McKinney 


University of Missouri 














New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should 
be sent to Donald G. Paterson, Editor, Department of Psychology, 
University of Minnesota, Minneapolis 14, Minnesota 


Detroit retail selling inventory. Harry J. Baker and Paul H. Voelker. 
Bloomington, Ill.: Public School Publishing Co., 1940. Sample set, 
30 cents. 

A visual motor gestalt test and its clinical use. Lauretta Bender. Ameri- 
can Orthopsychiatric Association, 1790 Broadway, Room 916, New 
York 19, N. Y., 1944. Pp. ix + 176. $3.50. 

Practical psychology. F. K. Berrien. New York: The Macmillan Company, 
March 1944. Pp. 660. 

“Psychosomatic disturbances in relation to personnel selection.” 
Lawrence K. Frank, Carl Binger et al. Annals of The New York 
Academy of Sciences. Vol. XLIV, Article 6, pp. 539-622, 1943. 
$1.00. 

Education and health of the partially seeing child. Winifred Hathaway. 
New York: Columbia University Press, 1943. Pp. 216. $2.50. 

Doll play of Pilagá Indian children. Jules and Zunia Henry. American 
Orthopsychiatric Association, 1790 Broadway, Room 916, New York 
19, N. Y., 1944. $3.50. 

Leta S. Hollingworth. Harry L. Hollingworth. Lincoln: The University 
of Nebraska Press, 1943. Pp. 204. 

Behavior and neurosis. Jules H. Masserman. Chicago: The University 
of Chicago Press, 1943. Pp. xv + 266. $3.00. 

The rights of infants. Margaret A. Ribble. New York: Columbia University 
Press, 1943. Pp. xii + 112. $1.75. 

Tall men have their problems too. Francis B. Riggs, 21 Coolidge Hill 
Road, Cambridge 38, Massachusetts. Pp. 147. $1.00 (privately 
published). 

Child psychology. Skinner & Harriman. New York: Macmillan Com- 
pany, 1943. Pp. 522. $3.00. 

The march of medicine. Number VIII of the New York Academy of 
Medicine Lectures to the Laity, 1943. New York: Columbia Uni- 
versity Press, 1943. Pp. x + 158. $2.00. 

Meeting children’s emotional disorders at school. U.S. Office of Education. 
Washington, D. C.: Superintendent of Documents, Government 
Printing Office. (School Children and the War Series, Leaflet No. 6, 
pp. 15.) $.05. 


Juvenile delinquency and the schools in wartime. U.S. Office of Education. 
Washington, D. C.: Superintendent of Documents, Government 
Printing Office. (School Children and the War Series, Leaflet No. 8.) 
$.10. 

Schools and classes for exceptional children: The child with impaired hearing. 
Los Angeles City School District School Publication No. 391, 1943. 
Pp. 48. 

Statistical abstract of the United States, 1942. Washington, D. C.: Super- 
intendent of Documents, Government Printing Office. $1.75. 








































